CROSS REFERENCE TO RELATED APPLICATION

This application is a continuation of PCT Application No. PCT/CA00/0587,
filed May 12, 2000, which claims priority from U.S. Provisional Application No. 60/134,259,
filed May 14, 1999, the entire contents of both of which are hereby incorporated by
reference.



FIELD OF THE INVENTION

The present invention relates to proteomics. More specifically, the invention
relates to protein-protein interactions and methods for identifying interacting proteins and
the amino acid sequence at the site of interaction.

BACKGROUND OF THE INVENTION

Specific protein-protein interactions are critical events in biological processes.
Protein-protein interactions govern biological processes that handle cellular information
flow and control cellular decisions (e.g., signal transduction, cell cycle regulation and
assembly of cellular structures). The entire network of interactions between cellular proteins
is a biological chart of functional events that regulate the internal working of living
organisms and their responses to external signals. A necessary step for the completion
of this biological interaction chart is the knowledge of all the gene sequences in a given
living organism. The entire DNA sequence of the Homo sapiens genome will be
completed at the latest by the year 2003. Unfortunately, the sequence of a gene reveals
neither its biological function nor its position in the biological chart. Given the expected
number of proteins in the human genome (80,000 to 120,000), the mapping of the
biological chart of protein-protein interactions will be an enormous, but rewarding, task.

During the past few decades, several techniques have been developed to
determine the interactions between proteins. These techniques include: i) physical
methods to select and detect interacting proteins (e.g., protein affinity chromatography,
co-immunoprecipitation, crosslinking, and affinity blotting); ii) library based methods (e.g.,
phage display and two-hybrid systems); and iii) genetic methods (e.g., overproduction
phenotype, synthetic lethal effects, and unlinked noncomplementation). Of the above
mentioned methods for detecting protein-protein interactions, the two-hybrid systems are
the most popular and the most extensively used. In the classical two-hybrid system,
transcription of reporter genes depends on an interaction between a DNA-bound "bait"
protein and an activation-domain containing "prey" protein. The two-hybrid systems
unfortunately may suffer from a number of disadvantages. For example, the interaction of
proteins is monitored in the nuclear milieu rather than in the cytoplasm, where most proteins
are found; the approach does not allow the simultaneous identification of the precise amino acid
sequences between two interacting proteins; and it cannot be easily applied to different cell
types or tissues in which different interacting proteins may be expressed.
It has been previously demonstrated that small synthetic peptides can bind to
proteins (Adorini, L., 1993, Clin. Exp. Rheumatol. 8:S41-44; Chen et al., 1997, Curr. Opin.
Chem. Biol. 1:458-466; Hoogenboom et al., 1998, Immunotechnology 4:1-20; Klemm et
al., 1998, Ann. Rev. Immunol. 16:569-592; Stanfield et al., 1995, Curr. Opin. Struct. Biol.
5:103-113). Nevertheless, the use of synthetic peptides in a systematic approach to
identify interacting protein domains and sequences has not been proposed or provided.
Certain signature domains have been shown to bind with high affinity to specific peptide
sequences (e.g., the Src homology-2 or SH2 domain of Src-family kinases binds tightly to
a phosphorylated tyrosine (Y*-EEI) sequence found in epidermal growth factor receptor and
the focal adhesion kinase) (Kuriyan et al., 1997, Ann. Rev. Biophys. Biomol. Struct. 26:259-288).

There thus remains a need to provide a method which enables identification
of i) the exact amino acid sequences of at least one binding partner between interacting
proteins; ii) numerous, possibly all, interacting proteins in different cells or tissues; and iii)
the specific domains (or sequences) between two interacting proteins as targets for
isolation of lead drugs. In addition, there remains a need to provide methods and assays
which enable the identification of the precise amino acid sequence of interacting domains
of proteins significantly faster than conventional methods (e.g., days instead of
months).

The present invention seeks to meet these and other needs. 
The present description refers to a number of documents, the content of which
is herein incorporated by reference, in their entirety.



SUMMARY OF THE INVENTION

The present invention seeks to overcome the drawbacks of the prior art. More
specifically, the invention concerns an approach to identify protein-protein interaction
domains which differs from the prior art. Moreover, one approach of the present invention
is based on an understanding of the principles that govern protein-protein interactions. Such
understanding, therefore, allows the use of several methods. One such method is exemplified
in detail below to identify: i) at least one of the exact amino acid sequences between
interacting proteins; ii) a number of, possibly all, interacting proteins in different cells or
tissues; and iii) the specific domains (or sequences) between two interacting proteins as
targets for isolation of lead drugs. Preferably, the method and assay of the present
invention enable a determination of i), ii) and iii). Moreover, unlike the approaches of the
prior art, the method described herein allows for the identification of interacting proteins
and the precise amino acid sequences of interactions in several days as opposed to
several months.

The ability to select proteins (or other molecules) that block interactions
between a gene product and some partners, but not others, should allow sophisticated
modulation of cellular signaling or cell metabolism in human cells and other currently
intractable systems. Indeed, the identification of proteins that interact with a therapeutically
important protein and the identification of the sites of interaction may be more relevant to
drug development than other genetic approaches such as "knock-outs". The latter
address the phenotypic consequences of disrupting all of the interactions in which a
given protein is involved, as opposed to inhibiting the interaction of one protein (at
worst, a few proteins as opposed to all) in a multimeric complex.

The present invention further relates to a novel approach in drug discovery. A 
major obstacle in drug development for the treatment of diseases has been the 
identification of target proteins and their functional sites. In fact, most research and 
development (R&D) projects in pharmaceutical companies take several years to identify 
a valid target protein. The selection of drugs that bind to and inhibit the functions of these 
proteins takes several years and is generally non-specific and random. Furthermore, drugs 
identified by current approaches often target the active sites in proteins. Such drugs thus 
often lead to major side-effects. Therefore, it is not surprising that many R&D projects 
never lead to the development of specific drugs even after three to five years of intensive 
research efforts. The methods and assays to identify protein-protein interactions of the
present invention address three important steps in the development of drugs:

1) the identification of the amino acid sequences of all interacting domains in target 
proteins; 

2) the identification of a set of interacting proteins (preferably all interacting proteins) for 
drug development; and 

3) screening for specific drugs against each of the interacting domains in a target protein. 

P-glycoprotein (P-gp) has been shown to cause multidrug resistance in tumor
cell lines selected with lipophilic anticancer drugs. Analysis of the P-gp amino acid sequence
has led to a proposed model of a duplicated molecule with two hydrophobic and
hydrophilic domains linked by a highly charged region of about 90 amino acids, the linker
domain. Although similarly charged domains are found in other members of the P-gp
superfamily, the function(s) of this domain are not known. Herein, it is demonstrated using
the method of the present invention that this domain binds to other cellular proteins. Using
overlapping hexapeptides that span the entire amino acid sequences of the linker domains
of human P-glycoprotein gene 1 and 3 (HP-gp1 and HP-gp3), a direct and specific binding
between P-gp1 and 3 linker domains and intracellular proteins is shown herein. Three
different stretches (EKGIYFKLVTM (SEQ ID NO:1),
SRSSLIRKRSTRRSVRGSQA (SEQ ID NO:2), and PVSFWRIMKLNLT (SEQ
ID NO:3) for P-gp1 and MKKEGVYFKLVNM (SEQ ID NO:4),
KAATRMAPNGWKSRLFRHSTQKNLKNS (SEQ ID NO:5), and
PVSFLKVLKLNKT (SEQ ID NO:6) for P-gp3) in the linker domains specifically bound to
proteins with apparent molecular masses of ~80 kDa, 57 kDa and 30 kDa. Interestingly,
only the 57 kDa protein was bound, to varying degrees, to the three different sequences in
the linker domain. Moreover, the binding between the overlapping peptides encoding the
linker sequence and the 57 kDa protein was resistant to the zwitterionic detergent
CHAPS, but was sensitive to SDS. Purification and partial N-terminal amino acid
sequencing of the 57 kDa protein showed that it has the N-terminal amino acid sequences of
alpha- and beta-tubulins. Further, Western blot analysis using monoclonal antibodies that
bind to α- and β-tubulins confirmed the identity of the 57 kDa protein. Taken together, this
is the first example showing protein interactions with the P-gp linker domain. This may, of
course, be important to the overall function of P-gp. More importantly, the results in this
study demonstrate the novel concept whereby the interactions between two proteins are
mediated by strings of a few amino acids with high and repulsive binding energies.
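
By way of non-limiting illustration only, the manner in which positive overlapping
hexapeptides can define longer binding stretches, such as those listed above, may be
sketched in Python. The function name, the 0-based start positions and the positive
windows used in the example are hypothetical and are not data from the present study;
the rule that touching or overlapping windows are merged is likewise an assumption.

```python
def merge_positive_windows(sequence, positive_starts, length=6):
    """Merge positive peptide windows into contiguous binding stretches.

    sequence        -- amino acid sequence that the peptide set spans
    positive_starts -- 0-based start indices of peptides scored as binding
                       (hypothetical input; scoring is not modelled here)
    length          -- peptide length (6 for hexapeptides)

    Returns (first_residue, last_residue, stretch) tuples, 1-based inclusive.
    """
    stretches = []  # each entry is [start, end) in 0-based coordinates
    for start in sorted(set(positive_starts)):
        end = start + length
        if stretches and start <= stretches[-1][1]:
            # Window overlaps or touches the previous stretch: extend it.
            stretches[-1][1] = max(stretches[-1][1], end)
        else:
            stretches.append([start, end])
    return [(s + 1, e, sequence[s:e]) for s, e in stretches]


# Hypothetical positives on the sequence of SEQ ID NO:2 (illustration only).
print(merge_positive_windows("SRSSLIRKRSTRRSVRGSQA", [0, 1, 12, 13, 14]))
```

In this sketch, two groups of positive windows separated by non-binding peptides are
reported as two distinct stretches.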

In accordance with one embodiment of the present invention, there is provided
a method of identifying a high-affinity interacting domain in a chosen protein, domain
thereof, or part thereof, and the amino acid sequence thereof comprising: a) providing a
set of overlapping peptides spanning a complete sequence of the chosen protein, domain
thereof, or part thereof, covalently bound to a support; b) providing a mixture of proteins
and/or a mixture of peptides; c) incubating the set of overlapping peptides of a), with the
mixture of b), under conditions enabling the binding between a high-affinity interacting
domain in a peptide of the set and one or more protein or peptide of b) to occur; d)
washing to remove any protein-protein interaction of c) which is not a high-affinity interaction; and
e) identifying which peptide of a) interacts with high-affinity to a protein or peptide of b);
thereby identifying the peptide of e) and the sequence thereof as a high-affinity interacting
domain.
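
As a minimal sketch of step a) above, a set of overlapping peptides spanning a chosen
sequence can be enumerated as follows. The default length of six follows the
hexapeptides used herein; the one-residue offset between successive peptides and the
function name are assumptions made for illustration only.

```python
def overlapping_peptides(sequence, length=6, step=1):
    """List (1-based start position, peptide) pairs spanning `sequence`.

    length -- number of residues per peptide
    step   -- offset between successive peptides (assumed to be 1 here)
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    # range() is empty when the sequence is shorter than one peptide.
    return [(i + 1, sequence[i:i + length])
            for i in range(0, last_start + 1, step)]


# Sequence of SEQ ID NO:1 used as the input sequence.
for position, peptide in overlapping_peptides("EKGIYFKLVTM"):
    print(position, peptide)
```

With an offset of one residue, a sequence of n residues yields n - 5 hexapeptides, so
that every residue of the chosen sequence is represented in at least one peptide.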

In accordance with another aspect of the present invention, there is provided
a method of identifying an agent which modulates an interaction between high-affinity
interacting domains of a set of overlapping peptides spanning a complete sequence
of a chosen protein, domain thereof, or part thereof, the set being covalently bound to a
support, and a mixture of proteins and/or a mixture of peptides, comprising: a) incubating
the set of overlapping peptides with the mixture in the presence of at least one agent, under
conditions enabling the binding between a high-affinity interacting domain in a peptide of
the set and one or more protein or peptide of the mixture to occur; b) washing to remove
any protein-protein interaction of a) which is not a high-affinity interaction; and c) identifying
which peptide of a) interacts with high-affinity to a protein or peptide of the mixture in the
presence of the agent as compared to in the absence thereof; thereby identifying the agent
as a modulator of the high-affinity interaction when the interaction in the presence of the
agent is measurably different from the interaction in the absence thereof.
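
The comparison of step c) above can be sketched as follows, assuming that a numerical
binding signal is available for each peptide with and without the agent. The function
name, the dictionary inputs, the example signals and the two-fold threshold are all
hypothetical; the present description does not prescribe what constitutes a
measurable difference.

```python
def flag_modulated_peptides(signal_without, signal_with, min_fold_change=2.0):
    """Return {peptide: ratio} for peptides whose signal changes with the agent.

    signal_without / signal_with -- binding signal per peptide, in the absence
                                    and presence of the agent (hypothetical)
    min_fold_change              -- assumed cut-off for a measurable difference
    """
    flagged = {}
    for peptide, baseline in signal_without.items():
        treated = signal_with.get(peptide)
        if treated is None or baseline <= 0:
            continue  # no paired measurement, or no baseline binding
        ratio = treated / baseline
        if ratio >= min_fold_change or ratio <= 1.0 / min_fold_change:
            flagged[peptide] = ratio
    return flagged


# Hypothetical signals, for illustration only.
without_agent = {"EKGIYF": 100.0, "KGIYFK": 80.0}
with_agent = {"EKGIYF": 20.0, "KGIYFK": 75.0}
print(flag_modulated_peptides(without_agent, with_agent))
```

A ratio below one would correspond to an agent that reduces the interaction, and a
ratio above one to an agent that enhances it.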

In accordance with yet another aspect of the present invention, there are
provided agents identified as modulators of the high-affinity protein interactions of the
present invention.

For the purpose of the present invention, the following abbreviations and terms 
are defined below. 


DEFINITIONS 

The terminology "overlapping peptides spanning a peptide sequence" (e.g.,
a domain, a full length protein sequence, or a part thereof) or the like refers to peptides of
a chosen size, based on the sequence of the protein (or part thereof). Preferably, these
peptides are synthetic peptides.

As explained hereinbelow, the size of the overlapping peptides has a
significant impact on the workings of the present invention. For example, peptides of four
contiguous amino acids appear to significantly increase the low affinity binding of proteins
thereto. Moreover, the use of larger peptides, such as 20 amino acids or more, would be
expected to increase the proportion of repulsive amino acids to high affinity amino acids,
thereby masking or totally inhibiting the binding of specific proteins to the peptides. Thus,
while the person of ordinary skill would understand that there are trade-offs associated with
the choice of small peptides as opposed to larger ones, the preferred size for the
overlapping peptides of the present invention is between 5 and 15 amino acids, or
between 5 and 12 amino acids, or between 5 and 10 amino acids.

The term "support", in the context of a support to which the overlapping peptides
of the present invention are covalently bound, can be chosen from a multitude of supports
found in the art. Such supports include CHIPs, plates (e.g., 96-well plates), glass beads
and the like. The CHIP technology is well-known in the art. References relating thereto
include Debouck et al., Nat. Genet. 1999 Jan;21(1 Suppl):48-50, Review; Brown et al., Nat.
Genet. 1999 Jan;21(1 Suppl):33-7, Review; Cheung et al., Nat. Genet. 1999 Jan;21(1
Suppl):15-9, Review; Duggan et al., Nat. Genet. 1999 Jan;21(1 Suppl):10-4, Review;
Schena et al., Trends Biotechnol. 1998 Jul;16(7):301-6, Review; and Ramsay et al., Nat.
Biotechnol. 1998 Jan;16(1):40-4, Review.

Protein sequences are presented herein using the one letter or three letter
amino acid symbols as commonly used in the art and in accordance with the
recommendations of the IUPAC-IUB Biochemical Nomenclature Commission.

Unless defined otherwise, the scientific and technological terms and
nomenclature used herein have the same meaning as commonly understood by a person
of ordinary skill in the art to which this invention pertains. Generally, the procedures for cell cultures,
infection, molecular biology methods and the like are common methods used in the art.
Such standard techniques can be found in reference manuals such as, for example,
Sambrook et al. (1989, Molecular Cloning - A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, New York) and Ausubel et al. (1994, Current Protocols
in Molecular Biology, Wiley, New York).

The present description, although directed mainly to proteins, also refers to a number of
recombinant DNA (rDNA) technology terms. Non-limiting selected examples are provided
for clarity and consistency.

As used herein, "nucleic acid molecule" refers to a polymer of nucleotides.
Non-limiting examples thereof include DNA (e.g., genomic DNA, cDNA) and RNA
molecules (e.g., mRNA). The nucleic acid molecule can be obtained by cloning techniques
or synthesized. DNA can be double-stranded or single-stranded (coding strand or
non-coding strand (antisense)).

The term "recombinant DNA" as known in the art refers to a non-natural DNA 
molecule resulting from the joining of DNA segments. This is often referred to as genetic 
engineering. 

The term "DNA segment" is used herein to refer to a DNA molecule comprising 
a linear stretch or sequence of nucleotides. This sequence, when read in accordance with 
the genetic code, can encode a linear stretch or sequence of amino acids which can be 
referred to as a polypeptide, protein, protein fragment (e.g., peptide) and the like.

The terminology "amplification pair" refers herein to a pair of oligonucleotides 
(oligos) of the present invention, which are selected to be used together in amplifying a 
selected nucleic acid sequence by one of a number of types of amplification processes, 
preferably a polymerase chain reaction (PCR). Other types of amplification processes 
include ligase chain reaction, strand displacement amplification, or nucleic acid
sequence-based amplification, as explained in greater detail below. As commonly known 
in the art, the oligos are designed to bind to a complementary sequence under selected 
conditions. 

The nucleic acid (e.g., DNA or RNA) for practicing the present invention may
be obtained according to well known methods.

As used herein, the term "physiologically relevant" is meant to describe
interactions which can take effect to modulate an activity or level of one or more proteins
in their natural setting.

The term "DNA" molecule or sequence (as well as sometimes the term
"oligonucleotide") refers to a molecule comprised of the deoxyribonucleotides adenine (A),
guanine (G), thymine (T) and/or cytosine (C), in a double-stranded form, and comprises or
includes a "regulatory element" according to the present invention, as the term is defined
herein. The term "oligonucleotide" or "DNA" encompasses linear DNA molecules or
fragments, viruses, plasmids, vectors, chromosomes or synthetically derived DNA. As
used herein, particular double-stranded DNA sequences may be described according to
the normal convention of giving only the sequence in the 5' to 3' direction.

"Nucleic acid hybridization" refers generally to the binding of two
single-stranded nucleic acid molecules having complementary base sequences, which
under appropriate conditions will form a thermodynamically favored double-stranded
structure. Examples of hybridization conditions can be found in the two laboratory manuals
referred to above (Sambrook et al., 1989, supra and Ausubel et al., 1989, supra) and are
commonly known in the art. In the case of a hybridization to a nitrocellulose filter, as for
example in the well known Southern blotting procedure, a nitrocellulose filter can be
incubated overnight at 65°C with a labeled probe in a solution containing 50% formamide,
high salt (5 x SSC or 5 x SSPE), 5 x Denhardt's solution, 1% SDS, and 100 µg/ml
denatured carrier DNA (e.g., salmon sperm DNA). The non-specifically binding probe can
then be washed off the filter by several washes in 0.2 x SSC/0.1% SDS at a temperature
which is selected in view of the desired stringency: room temperature (low stringency),
42°C (moderate stringency) or 65°C (high stringency). The selected temperature is based
on the melting temperature (Tm) of the DNA hybrid. Of course, RNA-DNA hybrids can also
be formed and detected. In such cases, the conditions of hybridization and washing can be
adapted according to well known methods by the person of ordinary skill. Stringent
conditions will preferably be used (Sambrook et al., 1989, supra) to ensure more complete
and accurate hybridization.
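
For convenience, the wash conditions recited above can be captured in a small lookup,
sketched here in Python. The buffer and temperatures are those stated above; the
function and variable names are hypothetical, and room temperature is deliberately kept
as text since no numerical value is given.

```python
# Wash buffer and temperatures as recited in the paragraph above.
WASH_BUFFER = "0.2 x SSC/0.1% SDS"
WASH_TEMPERATURES = {
    "low": "room temperature",
    "moderate": "42 C",
    "high": "65 C",
}


def wash_conditions(stringency):
    """Return a description of the wash step for the requested stringency."""
    key = stringency.strip().lower()
    if key not in WASH_TEMPERATURES:
        raise ValueError(f"unknown stringency: {stringency!r}")
    return f"{WASH_BUFFER} at {WASH_TEMPERATURES[key]}"


print(wash_conditions("high"))
```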

Probes for nucleic acids can be utilized with naturally occurring 
sugar-phosphate backbones as well as modified backbones including phosphorothioates, 
dithionates, alkyl phosphonates and α-nucleotides and the like. Modified sugar-phosphate
backbones are generally taught by Miller, 1988, Ann. Reports Med. Chem. 23:295 and 
Moran et al., 1987, Nucleic Acids Res., 14:5019. Probes of the invention can be 
constructed of either ribonucleic acid (RNA) or deoxyribonucleic acid (DNA), and 
preferably of DNA. 

It is an advantage of the present invention that the detection of the interaction 
between proteins and/or peptides be dependent on a label. Such labels provide sensitivity 
and often enable automation. In one embodiment of the present invention, automation is 
performed using CHIP technology. For example, the overlapping peptides, spanning a 
chosen sequence of a protein, are bound to a CHIP which can then be used to automate 
a test for interaction with proteins or peptides. Of course, it should be understood that the
present invention is not strictly dependent on a design and synthesis of the overlapping set
of peptides spanning a chosen protein sequence. Indeed, banks of peptides are available, 
from which this set of overlapping peptides could be constructed. 

Protein labelling is well-known in the art. Non-limiting examples of labels
include ³H, ¹⁴C, ³²P, and ³⁵S. Non-limiting examples of detectable markers include
ligands, fluorophores, chemiluminescent agents, enzymes, and antibodies. It will become 
evident to the person of ordinary skill that the choice of a particular label dictates the 
manner in which it is bound to the protein. 

The identification of the interaction is not specifically dependent on labelling of
the proteins since, for example, this interaction could be assessed using proteomic
approaches (such as 2-D gels and mass spectrometry) or using a library of antibodies.

As commonly known, radioactive amino acids can be incorporated into
peptides or proteins of the invention by several well-known methods. A non-limiting
example thereof includes in vitro or in vivo labelling of proteins using ³⁵S-Met.

The term "vector" is commonly known in the art and defines a plasmid DNA, 
phage DNA, viral DNA and the like, which can serve as a DNA vehicle into which DNA of 
the present invention can be cloned. Numerous types of vectors exist and are well known 
in the art. 

The term "expression" defines the process by which a gene is transcribed into
mRNA (transcription), the mRNA then being translated (translation) into one or more
polypeptides (or proteins).

The terminology "expression vector" defines a vector or vehicle as described 
above, but designed to enable the expression of an inserted sequence following 
transformation into a host. The cloned gene (inserted sequence) is usually placed under the
control of control element sequences such as promoter sequences. The placing of a 
cloned gene under such control sequences is often referred to as being operably linked to 
control elements or sequences. 

Operably linked sequences may also include two segments that are transcribed
onto the same RNA transcript. Thus, two sequences, such as a promoter and a "reporter
sequence", are operably linked if transcription commencing in the promoter will produce an
RNA transcript of the reporter sequence. In order to be "operably linked" it is not necessary
that two sequences be immediately adjacent to one another.

Expression control sequences will vary depending on whether the vector is 
designed to express the operably linked gene in a prokaryotic or eukaryotic host or both 
(shuttle vectors) and can additionally contain transcriptional elements such as enhancer 
elements, termination sequences, tissue-specificity elements, and/or translational initiation 
and termination sites. 

Prokaryotic expression vectors are useful for the preparation of large quantities
of the protein encoded by the DNA sequence of interest. This protein can be purified
according to standard protocols that take advantage of the intrinsic properties thereof, such
as size and charge (e.g., SDS gel electrophoresis, gel filtration, centrifugation, ion
exchange chromatography, etc.). In addition, the protein of interest can be purified via
affinity chromatography using polyclonal or monoclonal antibodies. The purified protein is
useful for therapeutic applications.

The DNA construct can be a vector comprising a promoter that is operably 
linked to an oligonucleotide sequence of the present invention which, in turn, is operably 
linked to a heterologous gene, such as the gene for the luciferase reporter molecule. 
"Promoter" refers to a DNA regulatory region capable of binding directly or indirectly to 
RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding 
sequence. For purposes of the present invention, the promoter is bounded at its 3' terminus
by the transcription initiation site and extends upstream (5' direction) to include the 
minimum number of bases or elements necessary to initiate transcription at levels 
detectable above background. Within the promoter will be found a transcription initiation 
site (conveniently defined by mapping with S1 nuclease), as well as protein binding 
domains (consensus sequences) responsible for the binding of RNA polymerase. 
Eukaryotic promoters will often, but not always, contain "TATA" boxes and "CCAAT" boxes.
Prokaryotic promoters may contain Shine-Dalgarno sequences in addition to the -10 and
-35 consensus sequences.

As used herein, the designation "functional derivative" denotes, in the context
of a functional derivative of a sequence, whether a nucleic acid or amino acid sequence,
a molecule that retains a biological activity (either functional or structural) that is substantially
similar to that of the original sequence. This functional derivative or equivalent may be a
natural derivative or may be prepared synthetically. Such derivatives include amino acid
sequences having substitutions, deletions, or additions of one or more amino acids,
provided that the biological activity of the protein is conserved. The same applies to
derivatives of nucleic acid sequences which can have substitutions, deletions, or additions
of one or more nucleotides, provided that the biological activity of the sequence is generally
maintained. When relating to a protein sequence, the substituting amino acid has
chemico-physical properties which are similar to those of the substituted amino acid. The similar
chemico-physical properties include similarities in charge, bulkiness, hydrophobicity,
hydrophilicity and the like. The term "functional derivatives" is intended to include
fragments, segments, variants, analogs, or chemical derivatives of the subject matter of the
present invention.

As is well-known in the art, a "conservative mutation" or "substitution" of an amino
acid refers to a mutation or substitution which maintains: 1) the structure of the backbone of
the polypeptide (e.g., a beta sheet or alpha-helical structure); 2) the charge or
hydrophobicity of the amino acid; or 3) the bulkiness of the side chain. More specifically,
the well-known terminology "hydrophilic residues" relates to serine or threonine.
"Hydrophobic residues" refer to leucine, isoleucine, phenylalanine, valine or alanine.
"Positively charged residues" relate to lysine, arginine or histidine. "Negatively charged
residues" refer to aspartic acid or glutamic acid. Residues having "bulky side chains" refer
to phenylalanine, tryptophan or tyrosine.
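
The residue classes defined above can be expressed as a simple lookup, sketched here in
Python using one letter amino acid symbols. Treating two residues as a conservative
pair whenever they share at least one of these classes is a simplifying assumption made
for illustration; the names used are hypothetical, and residues not listed in the
definitions above are assigned to no class.

```python
# Residue classes exactly as enumerated in the definitions above.
RESIDUE_CLASSES = {
    "hydrophilic": set("ST"),           # serine, threonine
    "hydrophobic": set("LIFVA"),        # Leu, Ile, Phe, Val, Ala
    "positively charged": set("KRH"),   # lysine, arginine, histidine
    "negatively charged": set("DE"),    # aspartic acid, glutamic acid
    "bulky side chain": set("FWY"),     # Phe, Trp, Tyr
}


def classes_of(residue):
    """Return the set of class names that contain the given residue."""
    code = residue.upper()
    return {name for name, members in RESIDUE_CLASSES.items() if code in members}


def is_conservative(original, substitute):
    """Assumed rule: conservative if the two residues share a class."""
    return bool(classes_of(original) & classes_of(substitute))


print(is_conservative("D", "E"), is_conservative("K", "D"))
```

Under this assumed rule, aspartic acid for glutamic acid and isoleucine for leucine are
both reported as conservative, whereas lysine for aspartic acid is not.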

Peptides, protein fragments, and the like in accordance with the present
invention can be modified in accordance with well-known methods dependently or
independently of the sequence thereof. For example, peptides can be derived from the
wild-type sequence exemplified herein in the figures using conservative amino acid
substitutions at 1, 2, 3 or more positions. The terminology "conservative amino acid
substitutions" is well-known in the art and relates to the substitution of a particular amino acid
by one having a similar characteristic (e.g., aspartic acid for glutamic acid, or isoleucine
for leucine). Of course, non-conservative amino acid substitutions can also be carried out,
as well as other types of modifications such as deletions or insertions, provided that these
modifications modify the peptide in a suitable way (e.g., without affecting the biological
activity of the peptide if this is what is intended by the modification). A list of exemplary
conservative amino acid substitutions is given hereinbelow.

TABLE 2

As can be seen in this table, some of these modifications can be used to
render the peptide more resistant to proteolysis. Of course, modifications of the peptides
can also be effected without affecting the primary sequence thereof using an enzymatic or
chemical treatment well-known in the art.

The term "variant" refers herein to a protein or nucleic acid molecule which is
substantially similar in structure and biological activity to the protein or nucleic acid of the
present invention.

The functional derivatives of the present invention can be synthesized
chemically or produced through recombinant DNA technology using methods well known
in the art. In one particular embodiment of the present invention, a variant according to the
present invention is identified using a method of the present invention. It can also be
designed to formally test for the conservation of particular amino acids (e.g., by
synthesizing a variant or mutant peptide). These variants can also be tested as part of the
full length sequence of the protein in order to validate the interaction. Of course, the skilled
artisan will understand that having identified a region of a chosen protein as a region which
is involved in high-affinity protein interaction(s) enables in vitro mutagenesis (or a testing
of related peptide sequences) of this region to identify and dissect the structure/function
relation of this region. Such methods are well-known in the art. When the interaction
domains of two proteins have been identified, it is thus possible for the skilled artisan to
identify and/or design variants having a modified affinity for an interacting protein. Of
course, when both interacting sequences are known, very powerful questions can be asked
to dissect the structure-function relationship which governs the high-affinity interaction
between same.

As used herein, "chemical derivatives" is meant to cover additional chemical
moieties not normally part of the subject matter of the invention. Such moieties could affect
the physico-chemical characteristics of the derivative (e.g., solubility, absorption, half-life,
decrease of toxicity and the like). Such moieties are exemplified in Remington's
Pharmaceutical Sciences (e.g., 1980). Methods of coupling these chemical-physical
moieties to a polypeptide are well known in the art.

The term "allele" defines an alternative form of a gene which occupies a given
locus on a chromosome.

As commonly known, a "mutation" is a detectable change in the genetic
material which can be transmitted to a daughter cell. As well known, a mutation can be, for
example, a detectable change in one or more deoxyribonucleotides. For example,
nucleotides can be added, deleted, substituted for, inverted, or transposed to a new
position. Spontaneous mutations and experimentally induced mutations exist. The result
of mutations of a nucleic acid molecule is a mutant nucleic acid molecule. A mutant
polypeptide can be encoded from this mutant nucleic acid molecule.

As used herein, the term "purified" refers to a molecule having been separated 
from other cellular components. Thus, for example, a "purified protein" has been purified 
to a level not found in nature. A "substantially pure" molecule is a molecule that is lacking
in most other cellular components.

As used herein, the terms "molecule", "compound" and "ligand" are used
interchangeably and broadly to refer to natural, synthetic or semi-synthetic molecules or
compounds. The term "molecule" therefore denotes, for example, chemicals,
macromolecules, cell or tissue extracts (from plants or animals) and the like. Non-limiting
examples of molecules include nucleic acid molecules, peptides, antibodies,
carbohydrates and pharmaceutical agents. The agents can be selected and screened by
a variety of means including random screening, rational selection and rational design
using, for example, protein or ligand modelling methods such as computer modelling,
combinatorial library screening and the like. The terms "rationally selected" or "rationally
designed" are meant to define compounds which have been chosen based on the
configuration of the interaction domains of the present invention. As will be understood by
the person of ordinary skill, macromolecules having non-naturally occurring modifications
are also within the scope of the term "molecule." For example, peptidomimetics, well
known in the pharmaceutical industry and generally referred to as peptide analogs, can be
generated by modelling as mentioned above. Similarly, in a preferred embodiment, the
polypeptides of the present invention are modified to enhance their stability. It should be
understood that in most cases this modification should not alter the biological activity of the
interaction domain. The molecules identified in accordance with the teachings of the
present invention have a therapeutic value in diseases or conditions in which the physiology
or homeostasis of the cell and/or tissue is compromised by a high-affinity protein
interaction identified in accordance with the present invention. Alternatively, the molecules
identified in accordance with the teachings of the present invention find utility in the
development of more efficient agents which can modulate such interactions.

Libraries of compounds (publicly available or commercially available, e.g., a 
combinatorial library) are well-known in the art. Libraries of peptides are also available. 
Such libraries can be used to build an overlapping set of peptide sequences spanning a 
chosen domain, protein or part thereof. 
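
By way of non-limiting illustration only, the construction of an overlapping set of peptide
sequences spanning a chosen sequence can be sketched as follows. The function name, the
default window of six residues, the offset of one residue and the demonstration sequence
are assumptions made for this sketch and are not taken from the specification.

```python
def overlapping_peptides(sequence, length=6, step=1):
    """Return (start_position, peptide) pairs that span `sequence`.

    Positions are 1-based. The defaults (hexapeptides offset by one
    residue) are illustrative assumptions, not prescribed values.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    seq = sequence.strip().upper()
    last_start = len(seq) - length  # negative when the sequence is too short
    return [(i + 1, seq[i:i + length]) for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Hypothetical 12-residue segment, used only for demonstration.
    for position, peptide in overlapping_peptides("MKTAYIAKQRQI"):
        print(position, peptide)
```

Each peptide shares all but `step` residues with its neighbour, so every stretch of `length`
consecutive residues in the chosen segment is represented when `step` is one.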

As used herein, the recitation "indicator cells" refers to cells that express, in one
particular embodiment, two interacting peptide domains of the present invention, and
wherein an interaction between these proteins or interacting domains thereof is coupled
to an identifiable or selectable phenotype or characteristic such that it provides an
assessment or validation of the interaction between same. Such indicator cells can also
be used in the screening assays of the present invention. In certain embodiments, the
indicator cells have been engineered so as to express a chosen derivative, fragment,
homolog, or mutant of these interacting domains. The cells are, for example, yeast cells
or higher eukaryotic cells such as mammalian cells (WO 96/41169). In one particular
embodiment, the indicator cell is a yeast cell harboring vectors enabling the use of the
two-hybrid system technology, as well known in the art (Ausubel et al., 1994, supra) and can be
used to test a compound or a library thereof. In one embodiment, a reporter gene encoding
a selectable marker or an assayable protein is operably linked to a control element such
that expression of the selectable marker or assayable protein is dependent on the
interaction of the Protein A and Protein B interacting domains. Such an indicator cell is
useful for rapidly screening at high-throughput a vast array of test molecules. In a particular
embodiment, the reporter gene is luciferase or β-Gal.

In one aspect, at least one of the two interacting proteins or domains of the 
present invention is provided as a fusion protein. The design of constructs therefor and the 
expression and production of fusion proteins are well known in the art (Sambrook et al., 
1989, supra; and Ausubel et al., 1994, supra). In a particular embodiment, both interaction
domains are part of fusion proteins. A non-limiting example of such fusion proteins
includes a LexA-Protein A fusion (DNA-binding domain-Protein A; bait) and a B42-Protein
B fusion (transactivator domain-Protein B; prey). In yet another particular embodiment, the
LexA-Protein A and B42-Protein B fusion proteins are expressed in a yeast cell also 
harboring a reporter gene operably linked to a LexA operator and/or LexA responsive 
element. Of course, it will be recognized that other fusion proteins can be used in such two 
hybrid systems. Furthermore, it will be recognized that the fusion proteins need not contain 
the full-length interacting proteins. Indeed, fragments of these polypeptides, provided that 
they comprise the interacting domains, can be used in accordance with the present 
invention, as evidenced with the peptide spanning method of the present invention. 

Non-limiting examples of such fusion proteins include hemagglutinin fusions,
glutathione-S-transferase (GST) fusions and maltose binding protein (MBP) fusions. In
certain embodiments, it is beneficial to introduce a protease cleavage site between the two 
polypeptide sequences which have been fused. Such protease cleavage sites between 
two heterologously fused polypeptides are well known in the art. 

In certain embodiments, it might also be beneficial to fuse the interaction 
domains of the present invention to signal peptide sequences enabling a secretion of the 
fusion protein from the host cell. Signal peptides from diverse organisms are well known 
in the art. Bacterial OmpA and yeast Suc2 are two non-limiting examples of proteins 
containing signal sequences. In certain embodiments, it is also beneficial to introduce a 
linker (commonly known) between the interaction domain and the heterologous polypeptide
portion. Such fusion proteins find utility in the assays of the present invention as well as for
purification purposes, detection purposes and the like.

For certainty, the sequences and polypeptides useful to practice the invention 
include, without being limited thereto, mutants, homologs, subtypes, alleles and the like. 
It shall be understood that generally, the sequences of the present invention encode a 
functional (albeit defective) interaction domain. It will be clear to the person of ordinary skill
that whether an interaction domain of the present invention, or variant, derivative, or 
fragment thereof, retains its function in binding to its partner can be readily determined by 
using the teachings and assays of the present invention and the general teachings of the 
art. 

As exemplified herein below, the interaction domains of the present invention 
can be modified, for example by in vitro mutagenesis, to dissect the structure-function 
relationship thereof and thereby permitting a better design and identification of modulating 
compounds. However, some derivatives or analogs having lost their biological function of
interacting with their respective interaction partner may still find utility, for example for 
raising antibodies. Such analogs or derivatives could be used, for example, to raise 
antibodies to the interaction domains of the present invention. These antibodies can be 
used for detection or purification purposes. In addition, these antibodies can also act as 
competitive or non-competitive inhibitors and be found to be modulators of an interaction 
identified in accordance with the present invention. 

A host cell or indicator cell has been "transfected" by exogenous or
heterologous DNA (e.g., a DNA construct) when such DNA has been introduced inside the
cell. The transfecting DNA may or may not be integrated (covalently linked) into
chromosomal DNA making up the genome of the cell. In prokaryotes, yeast, and
mammalian cells, for example, the transfecting DNA may be maintained on an episomal
element such as a plasmid. With respect to eukaryotic cells, a stably transfected cell is one
in which the transfecting DNA has become integrated into a chromosome so that it is
inherited by daughter cells through chromosome replication. This stability is demonstrated
by the ability of the eukaryotic cell to establish cell lines or clones comprised of a population
of daughter cells containing the transfecting DNA. Transfection methods are well known
in the art (Sambrook et al., 1989, supra; Ausubel et al., 1994, supra). The use of a
mammalian cell as an indicator can provide the advantage of furnishing an intermediate
factor, which permits or modulates the interaction of two polypeptides which are tested, that
might not be present in lower eukaryotes or prokaryotes. Of course, such an advantage might
be rendered moot if both polypeptides tested directly interact. It will be understood that
extracts from mammalian cells, for example, could be used in certain embodiments to
compensate for the lack of certain factors in a chosen indicator cell. It shall be realized that
the field of translation provides ample teachings of methods to prepare and reconstitute
different types of extracts.

In general, techniques for preparing antibodies (including monoclonal 
antibodies and hybridomas) and for detecting antigens using antibodies are well known in 
the art (see, e.g., Campbell, 1984, In "Monoclonal Antibody Technology: Laboratory 
Techniques in Biochemistry and Molecular Biology", Elsevier Science Publisher, 
Amsterdam, The Netherlands; and Harlow et al., 1988, In "Antibody - A Laboratory Manual",
CSH Laboratories). The present invention also provides polyclonal, monoclonal
antibodies, or humanized versions thereof, chimeric antibodies and the like, which inhibit
or neutralize their respective interaction domains and/or are specific thereto. 

From the specification and appended claims, the term "therapeutic agent" 
should be taken in a broad sense so as to also include a combination of at least two such
therapeutic agents. 

The DNA segments or proteins according to the present invention can be 
introduced into individuals in a number of ways. For example, erythropoietic cells can be 
isolated from the afflicted individual, transformed with a DNA construct according to the 
invention and reintroduced to the afflicted individual in a number of ways, including 
intravenous injection. Alternatively, the DNA construct can be administered directly to the 
afflicted individual, for example, by injection in the bone marrow. The therapeutic agent can
also be delivered through a vehicle such as a liposome, which can be designed to be
targeted to a specific cell type, and engineered to be administered through different routes. 

For administration to humans, the prescribing medical professional will 
ultimately determine the appropriate form and dosage for a given patient, and this can be 
expected to vary according to the chosen therapeutic regimen (e.g., DNA construct,
protein, molecule), the response and condition of the patient as well as the severity of the 
disease. 

Therapeutic compositions within the scope of the present invention should
contain the active agent (e.g., protein, nucleic acid, or molecule) in an amount effective to
achieve the desired therapeutic effect while avoiding adverse side effects. Typically, the
nucleic acids in accordance with the present invention can be administered to mammals
(e.g., humans) in doses ranging from 0.005 to 1 mg per kg of body weight per day of the
mammal which is treated. Pharmaceutically acceptable preparations and salts of the 
active agent are within the scope of the present invention and are well known in the art 
(Remington's Pharmaceutical Science, 16th Ed., Mack Ed.). For the administration of 
polypeptides, antagonists, agonists and the like, the amount administered should be 
chosen so as to avoid adverse side effects. The dosage is adapted by the clinician in 
accordance with conventional factors such as the extent of the disease and different 
parameters from the patient. Typically, 0.001 to 50 mg/kg/day are administered to the 
mammal. 
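
Purely as an arithmetic illustration of the per-kilogram ranges recited above, the
corresponding absolute daily amounts for a given body weight can be computed as sketched
below. The function name, the constant names and the 70 kg body weight are hypothetical and
serve only the example; the two ranges are the ones recited in the preceding paragraph.

```python
def daily_amount_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Convert a per-kilogram daily range into an absolute daily range in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low end of range exceeds high end")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


# Ranges recited in the text (mg per kg of body weight per day).
NUCLEIC_ACID_MG_PER_KG = (0.005, 1.0)
POLYPEPTIDE_MG_PER_KG = (0.001, 50.0)

if __name__ == "__main__":
    weight = 70.0  # hypothetical body weight in kg, for illustration only
    print(daily_amount_mg(weight, *NUCLEIC_ACID_MG_PER_KG))
    print(daily_amount_mg(weight, *POLYPEPTIDE_MG_PER_KG))
```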

The methods and assays of the present invention have also been validated with 
Annexin. This protein is significantly different from P-glycoprotein in both structure and 
function. Consequently, together with the knowledge of protein chemistry and molecular 
biology, these validations support the utility of the instant assays and methods for all 
proteins (from viruses, living cells, animals, plants, etc.).

BRIEF DESCRIPTION OF THE DRAWINGS

Having thus generally described the invention, reference will now be made to 
the accompanying drawings, showing by way of illustration a preferred embodiment 
thereof, and in which: 

Figure 1 shows the principle of protein-protein interaction. The plus signs (+)
indicate the regions of high-affinity binding. The minus signs (-) indicate the regions of
high-repulsive forces. As indicated in the text, interactions between two proteins are made
up of discontinuous regions of high-affinity binding and high-repulsive forces that are almost
in equilibrium, with high-affinity binding being more favoured while proteins are together;

Figure 2 is a schematic representation of a method of identification of
high-affinity binding sequences according to one aspect of the present invention. A, the different
shapes represent different proteins in a total cell lysate. The signs are as for Figure 1.
B, small overlapping peptides that cover the entire sequence (or a segment) of Protein A
are synthesized directly on derivatized wells of 96-well polypropylene plates. Following
peptide synthesis, metabolically radiolabeled total cell lysate is added to each well
containing the various peptides and incubated in an incubation buffer. C, the dark filled
circles represent the radiolabeled proteins from total cell lysate isolated from metabolically
radiolabeled cells added to all the wells of the 96-well plates to identify high-affinity binding
sequences on Protein A. D, after an extensive washing, the high-affinity binding sequences
(overlapping peptides from Protein A) are in those wells that bind radiolabeled proteins (in
dark). Four high-affinity binding sequences between Protein A and another protein(s) are
identified in rows 1, 3, 6 and 8. The wells that contain the high-affinity binding sequences
are identified by radiolabeled counting and SDS-PAGE;

Figure 3 is a schematic representation of a method of identification of
high-affinity binding sequences according to another aspect of the present invention. A shows
a schematic representation of the interaction between Protein A and Protein B. B, small
overlapping peptides that cover the entire sequence (or a segment) of Protein A are
synthesized directly on derivatized wells of 96-well polypropylene plates. Following peptide
synthesis, radiolabelled Protein B (synthesized from in vitro transcription-translation 
reaction mix) is added to each well containing the various peptides and incubated in an 
incubation buffer. C, the dark filled circles represent the radiolabeled Protein B that has 
been added to all the wells of the 96-well plates to identify high-affinity binding sequences 
on Protein A. D, after a washing procedure, the high affinity binding sequences are in 
those wells in which Protein B (radiolabeled protein in dark) is still bound to the peptides 
from Protein A. E, four high affinity binding sequences between Protein A and Protein B 
are identified in rows 1, 3, 6 and 8. The wells that contain the high-affinity binding 
sequences are identified by radiolabeled counting and SDS-PAGE; 

Figure 4 is a schematic representation of a method of selection of drugs that
specifically inhibit the binding of Protein A to Protein B according to one aspect of the present
invention. A shows a schematic representation of the interaction between Protein A and
Protein B. B, peptides that encode high-affinity binding sequences are used as LEAD
sequences for the selection of specific drugs that inhibit the association between Protein
A and Protein B and ultimately the function of the complex. To target the high-affinity
binding sequences that were identified in Figures 2 or 3, peptides including one of the
high-affinity binding sequences are synthesized in every well of the 96-well plate. Grey
circles represent one of four high-affinity binding sequences identified in Figures 2 and 3.
C, following the addition of a compound to be tested to each well of the 96-well plate,
radiolabeled Protein B is added to each of the wells. Of course, combinatorial libraries can
be screened to identify drugs that bind specifically to the high-affinity binding sequences
of Protein A. As previously, radiolabeled Protein B from transcription-translation reaction
mix is represented. Plates are washed and drugs that specifically bind to high-affinity
sequences of Protein A are found in those wells that do not contain radiolabeled Protein
B. D, wells containing drugs/compounds that bind specifically to one of the high-affinity
binding sequences in Protein A and therefore prevent the binding of Protein B are identified
by the absence of a dark circle (i.e., wells 28, 70 and 75). Selected drugs/compounds
represent invaluable LEAD compounds that can be used in biological assays to confirm
their mechanism of action. Validated drugs can proceed toward in vivo studies;

Figure 5 shows a P-glycoprotein predicted secondary structure and amino acid
sequence of the linker domain. A schematic representation of P-gp predicted secondary structure.
The twelve filled squares represent the twelve putative transmembrane domains. The two
ATP binding domains are represented by two circles in the N- and C-terminal halves of
P-gp. The inset represents the linker domain. The amino acid sequence of the linker
domains of Human P-gp1 (HP-gp1) and HP-gp3 is indicated as a single-letter amino acid
code. The numbers in brackets at the beginning and end of each amino acid sequence of
HP-gp1 and HP-gp3 show the length of the linker domains (1-90 and 1-88 for HP-gp1
and HP-gp3, respectively). The numbered lines underneath the amino acid sequence show
the sequences of the overlapping hexapeptides, which differ by one amino acid. For
HP-gp3, the last hexapeptide is number 88;

Figure 6 shows the protein binding to overlapping hexapeptides encoding
P-gp1 linker domain. Overlapping hexapeptides that encode the linker domain of HP-gp1
were synthesized on polypropylene rods and used to identify proteins that bind to these
peptides. A total of 90 plus two control hexapeptides for P-gp1 were incubated with total
cell lysate from [35S] methionine metabolically labeled cells (see methods). All bound
proteins were eluted from the peptide-fixed rods and resolved on 10% SDS PAGE. Lanes
1 to 92 show the [35S] methionine bound proteins from P-gp1. The migration of the
molecular weight markers is shown to the left of gels;

Figure 7 shows the effects of different detergents or high salt on the binding of
proteins to P-gp1 hexapeptides. Metabolically radiolabeled proteins bound to
hexapeptides (hexapeptides 50 to 53) from P-gp1 linker domain were eluted in the
presence of increasing concentrations of anionic detergent (0.12% - 0.5% SDS),
zwitterionic detergent (20 mM - 80 mM CHAPS) or salt (0.3 M - 1.2 M KCl). The y-axis
represents the amount of radioactivity eluted from a pool of three hexapeptides (50 to 53);

Figure 8 shows the effects of CHAPS on the binding of proteins to the
overlapping hexapeptides encoding P-gp1 linker domain. Overlapping hexapeptides of the
linker domain of HP-gp1 were incubated with total cell lysate from [35S] methionine
metabolically labeled cells extracted with 10 mM CHAPS. Bound proteins were eluted
from the peptide-fixed rods and resolved on 10% SDS PAGE. Lanes 1 to 92 show the
[35S] methionine bound proteins to P-gp1 linker domain. The migration of the molecular
weight markers is shown to the left of gels;

Figure 9 shows the protein binding to overlapping hexapeptides encoding
P-gp3 linker domain. Overlapping hexapeptides that encode the linker domain of HP-gp3
were synthesized on polypropylene rods and used to identify proteins that bind to these
peptides. A total of 88 plus two control hexapeptides for P-gp3 were incubated with total
cell lysate from [35S] methionine metabolically labeled cells. All bound proteins were eluted
from the peptide-fixed rods and resolved on 10% SDS PAGE. Lanes 1 to 90 show the
[35S] methionine bound proteins from P-gp3. The migration of the molecular weight
markers is shown to the left of gels;

Figure 10 shows the sequence alignment of three binding regions of P-gp1 and
P-gp3 linker domains. Alignment of P-gp1 and P-gp3 linker domains is shown using a
single-letter code for amino acids. The regions of high binding affinities for P-gp3 and
P-gp1 are shown in bold. Identical amino acids are shown by single letter code between the
two aligned sequences. Conserved amino acids are indicated by a plus (+) sign. The
numbers on each side of the amino acid sequence of the linker domains refer to the amino
acid sequence of human P-gp1 and 3 as in (Roninson et al., 1986, Proc. Natl. Acad. Sci.
USA 83:4538-4542; Van der Bliek et al., 1987, The EMBO Journal 6:3325-3331);

Figure 11 shows the two high affinity binding hexapeptides. Two high affinity
binding sequences RSSLIR (SEQ ID NO:7) and SVRGSQ (SEQ ID NO:8) from
P-gp1 linker domain were resynthesized and incubated with total cell lysate from [35S]
methionine metabolically labeled cells following 24 hour or 48 hour incubation times.
Bound proteins were eluted from peptide-fixed rods and resolved on 10% SDS PAGE.
The migration of the molecular weight markers is shown to the left of the figure;

Figure 12 shows the effects of different carrier proteins as blocking agent of
unspecific binding. Total cell lysates from [35S] methionine metabolically labeled CEM cells
were used as is or made 1% gelatin, 0.3% BSA or 3% BSA. The cell lysates were
incubated with a high affinity binding hexapeptide RSSLIR from P-gp1 linker domain.
The bound proteins were eluted with SDS sample buffer and resolved on 10% SDS PAGE.
The migration of the molecular weight markers is shown to the left of the figure;

Figure 13 shows the purification of a 57 kDa protein. Total cell lysate was
incubated with fifty P-gp1 hexapeptides RSSLIR and SVRGSQ. Samples
containing the 57 kDa protein (P57) from one hundred hexapeptide incubation mix were
pooled and resolved on 10% SDS PAGE. The resolved proteins were transferred to
PVDF membrane and stained with Ponceau S. The migration of the molecular weight
markers is shown to the right of the figure;

Figure 14 shows the western blot analysis with anti-tubulin monoclonal
antibodies. Total cell lysate from CEM cells and proteins eluted from the high affinity
binding hexapeptides of P-gp1 linker domain (P57) were resolved on SDS PAGE and
transferred to nitrocellulose membrane. One half of the membrane was probed with anti-α
and anti-β tubulin monoclonal antibodies. The migration of the molecular weight markers
is shown to the left of the figure; and

Figure 15 shows the helical wheel presentations of the high affinity binding
region of P-gp1 and P-gp3 linker domains. The single-letter amino acid code for the high
affinity binding region of P-gp1 and P-gp3 linker domains is shown. The positively
charged amino acids on one side of the helix have been circled.

Other objects, advantages and features of the present invention will become 
more apparent upon reading of the following non-restrictive description of preferred 
embodiments with reference to the accompanying drawing which is exemplary and should 
not be interpreted as limiting the scope of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT



The function or functions of proteins are mediated through an interaction thereof
with other cellular or extracellular proteins. Until now it was thought that interactions
between two proteins involve large segments of polypeptides that have complementary
amino acid sequences. However, it is not known how these complementary sequences
mediate the interactions between proteins. In this application, a novel concept to explain
the principle of protein-protein interactions is proposed. Briefly, interactions between any
two or more proteins are mediated by strings of discontinuous sequences with high-affinity
binding and high-repulsive forces (see Figure 1). The sum of these forces over the entire
exposed sequence of proteins determines the nature and extent of the interactions
between proteins. The sizes of these interacting domains can vary from 5 to 25 amino
acids in length. The attractive forces between two small high-affinity binding sequences are
generally larger than the sum of all the high-affinity binding and repulsive forces between
two proteins. Therefore, using the present approach, it is possible to isolate interacting
proteins from a mixture of proteins using a short peptide (almost six amino acids) that
encodes only the high-affinity binding sequence. Indeed, with this in mind, it is now easy to
see why many methods attempting to isolate interacting proteins have failed. The use of
large fragments or proteins to isolate interacting proteins is less efficient since the sum of
attractive/repulsive forces is much weaker than any string of attractive forces. The herein
proposed principle is also consistent with the fact that protein-protein interactions can be
modulated by post-translational modifications (e.g., by phosphorylation) and the presence
of other interacting proteins. Hence, the addition or loss of weak forces following
post-translational modification can disrupt the tenuous balance between high-affinity binding and
high-repulsive forces that hold proteins together or prevent their association. Support for
the magnitude of attractive forces between two high-affinity binding sequences is
demonstrated in antibody-antigen binding whereby the antigen can be only of a few amino
acids. Furthermore, numerous examples exist in biology where cellular interactions between
proteins occur due to the presence of small consensus sequences of five to ten amino acids.
Non-limiting examples of such small consensus sequences include the leucine zipper, and
SH2 and SH3 binding sequences. In addition to the domains of interactions between two
or more proteins (indicated above), protein-protein interactions can have many measurable
effects, such as: i) changes in the kinetic properties of one or both proteins; ii) formation
of new binding or functional sites; and iii) the inactivation of function(s). In other words, a
given protein could expose different functional domains or sequences in the presence as
opposed to the absence of any interacting proteins. Thus, in the presence of protein B,
protein A can expose other sequences not previously exposed for interactions with other
proteins. The latter concept is very important as it argues against the effectiveness of
some structural studies (i.e., X-ray and NMR) in predicting functional or surface exposed
domains from the resolved crystal structure of proteins. By enabling the measurement and
the identification of potentially all the high-affinity binding sites of a given protein, the
present invention seeks to overcome the drawbacks of the results obtained from such
structural studies.
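
The balance of forces proposed above can be pictured with a toy numerical sketch. All values
below are invented, in arbitrary units, and the signed-score representation is an assumption
made for illustration only; it is not a measurement or a model taken from the specification.
The sketch shows how a single high-affinity segment can exceed the net sum over the whole
exposed sequence when attractive and repulsive contributions are nearly in equilibrium.

```python
# Hypothetical signed contributions along the exposed sequence of a protein:
# positive = high-affinity (attractive) segment, negative = repulsive segment.
# The numbers are invented for illustration and carry arbitrary units.
segments = [5.0, -4.0, 6.0, -5.5, 4.5, -4.0]


def net_interaction(contributions):
    """Sum of all attractive and repulsive contributions."""
    return sum(contributions)


def strongest_attractive(contributions):
    """Largest single attractive contribution (0.0 if there is none)."""
    return max((c for c in contributions if c > 0), default=0.0)


net = net_interaction(segments)
best = strongest_attractive(segments)

if __name__ == "__main__":
    print("net over whole sequence:", net)
    print("strongest single segment:", best)
    print("single segment exceeds net:", best > net)
```

With these invented values the net interaction is slightly attractive, yet weaker than the
strongest single attractive segment, which mirrors the rationale given above for using short
peptides rather than large fragments.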

Further to the above examples of protein-protein interactions, a subset of
protein-protein interactions is dimerization. There is an abundance of examples in biology
whereby protein-protein interactions are essential for activation or inhibition of function.
Non-limiting examples of homo- or heterodimers include: growth factor receptors;
membrane transport proteins; tumor suppressor proteins; and proteins that mediate
apoptosis. In fact, dynamic dimerization is a common theme in the regulation of signal
transduction. Some of the functional consequences of dimerization include increased
proximity for activation of single transmembrane cell surface receptors (e.g., EGF receptor)
and differential regulation by heterodimerization (e.g., BCL2 family of proteins).

The protein concentration in living cells is very high and is in the range of 10-30
mg/ml. At this high protein concentration, most if not all proteins should interact precisely
and specifically with other cellular proteins. Some of the interacting proteins act as
inhibitors of function, while others may be activators (e.g., the BCL2-BAX family of
proteins). Moreover, the cycling of a given protein between activator and inhibitor
association will require the association-dissociation process to occur rapidly. For
example, when protein X is associated with an inhibitor protein I, the domains (small
sequences) that are required for the association of protein X with an activator protein A
may not be easily accessible in the X-I complex. Therefore, current methods to identify
associated proteins (i.e., the two-hybrid system and similar approaches) may not be able
to identify all associated proteins. In other words, current methods, when successful, may
only identify some but not all functional domains and their associated proteins. By contrast,
using the peptide scanning approach, the method of the present invention is capable of
identifying all functional domains or high-affinity interacting domains of protein X and its
associated proteins. Once the associated proteins are identified, their biological functions
as they relate to the target protein X can be tested. Thus, for a given interacting protein,
should its interaction with one or many possible associated proteins prove to be important
for function, the high-affinity binding sequences (between protein X and Protein I or A) can
be easily identified and can be used as a target site in high-throughput drug screening
assays (see below) or other assays.
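
As a non-limiting sketch of how readings from a peptide scan might be turned into candidate
high-affinity binding sequences, the following flags peptides whose radioactivity count exceeds
a background cutoff and groups consecutive peptide numbers into regions. The cutoff rule (mean
of the control peptides plus three standard deviations), the function names and the example
counts are assumptions made for this illustration and are not prescribed by the present
description.

```python
from statistics import mean, stdev


def binding_peptides(counts, control_ids, n_sd=3.0):
    """Return sorted peptide numbers whose count exceeds the background cutoff.

    counts      -- dict mapping peptide number to radioactivity count
    control_ids -- peptide numbers of the control peptides (at least two)
    n_sd        -- assumed cutoff: control mean + n_sd standard deviations
    """
    background = [counts[i] for i in control_ids]
    cutoff = mean(background) + n_sd * stdev(background)
    return sorted(i for i, c in counts.items()
                  if i not in control_ids and c > cutoff)


def group_consecutive(peptide_ids):
    """Merge consecutive peptide numbers into (first, last) regions."""
    regions = []
    for i in peptide_ids:
        if regions and i == regions[-1][-1] + 1:
            regions[-1].append(i)
        else:
            regions.append([i])
    return [(r[0], r[-1]) for r in regions]


if __name__ == "__main__":
    # Invented counts: peptides 1-6 are test peptides, 7 and 8 are controls.
    example = {1: 120, 2: 110, 3: 950, 4: 1020, 5: 130, 6: 880, 7: 100, 8: 140}
    hits = binding_peptides(example, control_ids={7, 8})
    print(hits)
    print(group_consecutive(hits))
```

Grouping consecutive hits reflects the fact that neighbouring overlapping peptides share most
of their residues, so a run of adjacent positives points to one underlying binding region.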

This invention includes the concept (described in Figures 2A-D) that
protein-protein interactions are made up of discontinuous high-affinity binding and high-repulsive
forces scattered throughout the 3D sequence of proteins and that these sequences can be
isolated using one of many possible approaches indicated herein (e.g., the overlapping
peptide approach). Although in this application the overlapping peptide approach is
exemplified, other approaches can be envisioned that give similar results. It should be
stressed that the approach described herein is immune to conformational changes
resulting from interacting proteins that could affect other commonly used methods to identify
protein-protein interactions (e.g., two-hybrid system, affinity blotting, and crosslinking). In
the two-hybrid system, for example, Protein A is fused with another protein sequence (the
DNA-bound "bait" protein) and the other interacting protein is fused to the
activation-domain containing "prey" protein. The fusion of interacting proteins to protein A could
expose regions other than those found in the native conformation which will affect their
interactions. Furthermore, the two-hybrid system has several disadvantages, some of
which are listed below:

i. The interaction of proteins is monitored in the nuclear milieu rather than the
cytoplasm where most proteins are found;

ii. Proteins can be toxic when expressed in different cells or organisms;

iii. The interactions between two proteins in a complex in the two-hybrid system
can sterically exclude the binding of other interacting proteins;

iv. The post-translational modification of one protein can exclude its interaction
with other proteins;

v. The two-hybrid system does not allow the simultaneous identification of the
precise amino acid sequences between two interacting proteins;

vi. The application of the two-hybrid system is associated with a high percentage
of false positives; or

vii. The two-hybrid system cannot be easily applied to different cell types or tissues
whereby different interacting proteins may be expressed (this can be a critical drawback
of this system).

Method to Identify Interacting Proteins and Sites of Interactions for Protein A

The present approach and methodology used to identify discontinuous strings
of sequences between two or more interacting proteins is a scanning overlapping peptide
approach. Using this approach, a large number of short overlapping peptides which cover
the entire amino acid sequence of a given protein, "the bait," are synthesized in parallel on
an inert solid support (see Figure 2). The rationale for synthesizing a large number of
overlapping peptides as opposed to a discontinuous peptide library is based on the fact
that one does not know a priori what exact sequence of a given protein will contain the high
affinity binding sites and the repulsive sequences. Therefore, a discontinuous peptide
approach will often lead to the presence of both high affinity binding sequences and
repulsive sequences in the same peptide. Such peptides will not bind to potential
interacting proteins with high affinity. Moreover, the use of overlapping peptides also
provides internal controls for nonspecific binding. For example, using overlapping peptides,
the high affinity binding sequences will give a peak of signal where peptides within the high
affinity domain contain the high affinity amino acid sequences but lack the amino acids
which provide the repulsive forces (see Figure 6 in Example 1). Of course, it should be
understood that the present invention is not dependent on a spanning of the full protein
sequence. Indeed, sub-region(s) of a protein can be used. In addition, overlapping
peptides can be derived from a chosen domain of a protein. Also, it is conceivable
to probe an overlapping peptide set of a first protein with an overlapping
peptide set of a second protein.
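
The generation of an overlapping peptide set can be illustrated with a minimal sketch. It assumes a sliding window that advances one residue at a time; the text does not specify the offset between consecutive peptides, so the `step` parameter, the function name and the placeholder sequence are all hypothetical.

```python
def overlapping_peptides(sequence, length=6, step=1):
    """Return (start_position, peptide) pairs tiling the sequence.

    start_position is 1-based. The default step of 1 residue is an
    assumption made for illustration, not a value taken from the text.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    # range() is empty when the sequence is shorter than one peptide
    return [(i + 1, sequence[i:i + length])
            for i in range(0, last_start + 1, step)]


# Placeholder sequence for illustration only; not a real protein domain.
demo = "ACDEFGHIKLMN"
for start, peptide in overlapping_peptides(demo):
    print(start, peptide)
```

Under these assumptions, a 90-residue domain scanned with hexapeptides at a one-residue offset yields 90 - 6 + 1 = 85 peptides, which fits within a single 96-well layout.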

To demonstrate how one can use this approach of overlapping peptides as "the
bait" to isolate interacting proteins, "the prey" or "preys," from a mixture of total cell proteins,
the following example can be considered. P-glycoprotein is a membrane protein that
confers resistance to anticancer drugs and, therefore, is responsible for the failure of
chemotherapy. Although P-glycoprotein has been shown to function by preventing the
accumulation of chemotherapeutic drugs in tumor cells, the exact mechanism of how this
protein functions and which associated proteins modulate its function are not
known. Thus, it is of interest to identify proteins that interact with P-glycoprotein, so as
to enable an inhibition of binding between P-glycoprotein and its associated proteins,
thereby potentially modulating its function in resistant tumor cells. In this example, it was
of interest to identify those proteins which bind to the linker domain of P-glycoprotein. Thus,
in this particular example, a domain of a chosen protein was used. The linker domain
is a region of about 90 amino acids. Thus, overlapping hexapeptides covering this
entire linker sequence of P-glycoprotein were synthesized onto a solid support using
standard Fmoc chemistry. The covalently fixed peptides (on a solid support) were
incubated with a total cell lysate isolated from cells metabolically labeled with
[35S]methionine. The peptides and total cell lysate were incubated in the presence of a
carrier substrate (1-3% bovine serum albumin, 1-3% gelatin, 1-3% skim milk, etc.) for
18 hours at 4°C.
Following this incubation period, the covalently fixed peptides were washed extensively with
isotonic buffer. Any proteins from the radiolabeled total cell lysate which maintained their
association with the overlapping hexapeptides following the washing step were eluted in
SDS-containing sample buffer and analyzed by SDS polyacrylamide gel electrophoresis
(SDS-PAGE) (Laemmli, U.K., 1970, Nature 227:680-685). The presence of radiolabeled
proteins on SDS-PAGE, following gel drying and signal enhancement, provides the
following information:

1) those specific overlapping peptides represent high affinity binding sequences in the
P-glycoprotein linker domain (or other chosen domains or non-chosen domains); and
2) the proteins bound to the specific overlapping peptides are associated proteins (see
Figure 6).

The associated proteins which bound to the high affinity binding sequences
can be isolated in large quantities for the purpose of determining their identity by N-terminal
amino acid sequencing by Edman degradation (Edman and Begg, 1967, Eur. J. Biochem.
1:80-91) or the like. Briefly, the sequences of the overlapping peptides that bound to a
given protein are resynthesized on a solid support and kept fixed thereto. Total cell lysate
from [35S]methionine metabolically radiolabeled cells is added to the solid support
containing the fixed high affinity sequence peptides and incubated as described above.
Following washing steps to remove unbound material, the associated protein is isolated
in large amounts following an elution step with SDS-containing buffers (see below). The
purified associated protein is now ready for amino acid sequencing. Of course, should
further purification steps be required, they are well known to the skilled artisan. The purified
protein is run on SDS-PAGE and the resolved protein is transferred to PVDF membrane
as previously described (Ausubel et al., 1994, supra). Other methods for amino acid
sequence determination can also be easily applied (Ausubel et al., 1994, supra).

Method to Identify the Amino Acid Sequences Between Two Interacting Proteins

The same concept as described above can be applied if one is only interested
in identifying the high affinity binding sequences between two proteins. A non-limiting
example of two such proteins is p53 and MDM.
Specifically, the purpose of this exercise is to identify the high affinity binding
sequences between protein A (p53) and protein B (MDM) in order to use these
sequences as target sites for the identification of compounds that modulate this interaction
and more particularly for the development of drugs. Thus, in one aspect, when a given drug
is bound to one of these high affinity binding sites on protein A, it will prevent the formation
of the active complex (protein A+B) and therefore inhibit the functions of the complex. To
isolate the string of high-affinity binding sequences between protein A and B (see Figure
3), small overlapping peptides (five to seven amino acids) that cover the entire amino acid
sequence of protein A, "the bait," are synthesized in parallel onto a solid support (as
mentioned above and described in more detail in Example 3). Note that in this particular
aspect, only the primary amino acid sequence of protein A, "the bait," is needed. Once the
peptides are synthesized (peptide synthesis is done in parallel on a solid support in 96-well
plates), an enriched and radiolabeled full-length protein B, "the prey," is added to each well
of the 96-well plate that contains the covalently fixed overlapping peptides (the radiolabeled
protein B is easily obtained from in vitro transcription-translation reactions). The peptides
representing protein A are incubated with radiolabeled protein B to allow binding to occur.
Following an incubation period (5 to 24 hours), unbound radiolabeled protein B is removed
by extensive washing in isotonic buffer. Any radiolabeled protein B that remains bound to
the overlapping peptides is eluted in the presence of denaturing agents. The eluant from
each well of the 96-well plate is analyzed for the presence of radiolabeled protein B by running
the samples on SDS-PAGE. High-affinity binding peptides are identified as those that
retain the radiolabeled protein B.
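
Once the binding peptides are known, their positions can be mapped back onto the primary sequence of protein A. The sketch below is one hypothetical way to do this: it merges overlapping or adjacent binding peptides into contiguous residue ranges. The function name and the merging rule are assumptions for illustration, and the merged range is only an outer bound on the binding site, since the actual high-affinity core may be shorter.

```python
def binding_regions(bound_starts, length=6):
    """Merge binding peptides into contiguous residue ranges.

    bound_starts: 1-based start positions of peptides that retained protein B.
    length: peptide length in residues.
    Returns a list of (first_residue, last_residue) tuples, inclusive.
    """
    regions = []
    for start in sorted(set(bound_starts)):
        end = start + length - 1
        if regions and start <= regions[-1][1] + 1:
            # Peptide overlaps or touches the previous region: extend it
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(region) for region in regions]


# Hypothetical result: peptides starting at residues 3, 4, 5 and 20 bound.
print(binding_regions([3, 4, 5, 20]))
```

With hexapeptides, the hypothetical input above gives two candidate regions, residues 3 to 10 and residues 20 to 25.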

The use of metabolically radiolabeled proteins as "the prey" to interact with the
overlapping peptides of "the bait" increases the sensitivity of this technique and allows the
identification of interacting proteins with binding affinities of 1 0"^° - 1 0'^^ M for a standard
50 kDa protein which contains one to ten radiolabeled methionine residues.

Method to Use High Affinity Binding Sequences in High Throughput Assays to
Screen for Lead Compounds

The approach described herein to identify high-affinity binding sequences or
target sites for drug development can also be used in high throughput assays to screen for
small molecules from combinatorial libraries. For example, to select drugs that specifically
inhibit the binding of protein A to B (see Figure 4), one or more target sites (the high
affinity binding sequences) are synthesized in each well of the 96-well plates as described
earlier. In this example (Figure 4) the same high affinity binding sequence is synthesized
in all of the wells. To each well containing the high affinity binding sequence, one or more
small molecules from a combinatorial library are added. Following the addition of drug(s),
a radiolabeled protein B from an in vitro transcription-translation mix, for example, is added
and allowed to incubate as indicated above. Following several washes, bound protein B
is eluted with SDS-sample buffer. Wells found to contain radiolabeled protein
B indicate that the drug had no effect on the binding between the high affinity binding
sequence and protein B. Alternatively, if one or more wells do not contain radiolabeled
protein B in the presence of a drug, then that drug has inhibited the interaction between
the high affinity binding sequence of protein A and protein B. Hence, the latter drug is a
good lead compound. These drugs can now enter the second phase of their analysis to
determine if they prevent the formation of the active complex of full-length protein A and B.
Active drugs that are identified are tested in vivo to further confirm their mechanism of
action. In this manner, more specific drugs with fewer or no side effects are developed.
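
The read-out of such a screen can be sketched as a simple per-well comparison against a compound-free control. The text describes the outcome only qualitatively (radiolabeled protein B present or absent), so the numeric cutoff, the function name and the example signals below are hypothetical.

```python
def classify_wells(signals, control_signal, inhibition_cutoff=0.5):
    """Label each well by the fraction of control signal it retains.

    signals: mapping of well name -> radioactivity eluted from that well.
    control_signal: signal from a well that received no compound.
    inhibition_cutoff: assumed threshold; wells retaining less than this
        fraction of the control signal are flagged as candidate inhibitors.
    """
    if control_signal <= 0:
        raise ValueError("control_signal must be positive")
    calls = {}
    for well, signal in signals.items():
        retained = signal / control_signal
        calls[well] = "inhibitor" if retained < inhibition_cutoff else "no effect"
    return calls


# Hypothetical counts for two wells and a compound-free control of 100.
print(classify_wells({"A1": 98.0, "A2": 7.0}, control_signal=100.0))
```

Wells flagged as "inhibitor" would correspond to the lead compounds that proceed to the second phase of analysis with the full-length proteins.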

The latter point provides an advantage since most proteins have more than one
biological function. For example, if protein A interacts with itself, it has one function, while
the same protein interacting with a different protein has a different function. Moreover,
protein A, when part of a given complex of associated proteins, mediates several functions.
For example, inhibiting the interactions between proteins A and B while leaving the
interactions between proteins A and C, D, or F intact inhibits one or a few cellular pathways.
By contrast, inhibiting the function of protein A inhibits the functions of the entire complex.
In this respect, the identification, isolation and development of drugs that specifically inhibit
interactions between two proteins within a complex of proteins should result in more
specific drugs with fewer side effects. In addition, as different proteins are differentially
expressed in different tissues or organs, the composition of a given protein complex
differs between tissues. Hence, the approach of developing drugs that inhibit
protein-protein interactions leads to drugs that are organ or tissue specific.
Of course, it will be understood that the present invention also provides
quantitative assays to measure the protein-protein interaction and the modulation thereof
by compounds.

In conclusion, the approach described in this application for the identification
of interacting proteins and of the precise amino acid sequence between interacting proteins,
and the targeting of such specific sequences in proteins with drugs that inhibit protein-protein
interactions, have tremendous potential in dictating future drug discovery in the
pharmaceutical industry.

The present invention is illustrated in further detail by the following non-limiting
examples.

EXAMPLE 1 

P-Glycoprotein Binding to Tubulin is Mediated
by Sequences in the Linker Domain 

The successful treatment of cancer patients with chemotherapeutic drugs is
often limited by the development of drug-resistant tumors. Tumor cell lines selected in
vitro with a single anticancer drug become resistant to a broad spectrum of
chemotherapeutic drugs and are termed multidrug resistant (or MDR) tumor cells (for review, see
Cole et al., 1996, Cancer Treatment & Research 87:39-62; Gottesman et al., 1995, Annu.
Rev. Genet. 29:607-649; and Ling, V., 1997, Cancer Chemother. Pharmacol. 40:Suppl:S3-8).
Moreover, the expression of MDR in these tumor cells has been associated with the
overexpression of two membrane proteins: the MDR1 P-glycoprotein (P-gp) and the
multidrug resistance-associated protein (MRP1) (Cole et al., 1996, Cancer Treatment &
Research 87:39-62; Gottesman et al., 1995, Annu. Rev. Genet. 29:607-649; and Ling, V.,
1997, Cancer Chemother. Pharmacol. 40:Suppl:S3-8). Both P-gp and MRP are members
of a large family of membrane transporter proteins known as ATP Binding Cassette
proteins or ABC membrane transporters (Higgins, C. F., 1992, Annual Review of Cell
Biology 8:67-113). Although the structure of P-gp1 remains a matter of speculation
(Rosenberg et al., 1997, Journal of Biological Chemistry 272:10685-10694), cumulative
topological evidence suggests a tandemly duplicated structure of six transmembrane
domains and a large cytoplasmic domain encoding an ATP binding sequence (Kast et
al., 1997, Journal of Biological Chemistry 272:26479-26487; Loo et al., 1995, Journal of
Biological Chemistry 270:843-848). The two halves of P-gp1 are linked by a stretch of 90
residues rich in polar or charged amino acids, termed the linker domain.

The P-gp gene family is made up of three structurally similar isoforms in
rodents (classes I, II, and III) and two isoforms in humans (classes I and III) (Childs et al.,
1994, Important Adv. Oncol., pp. 21-36). Gene transfer studies suggest functional
differences among these structurally similar isoforms. For example, only the P-gp isoforms
of classes I and II confer the MDR phenotype (Devault et al., 1990, Molecular and Cellular
Biology 10:1652-1663; Van der Bliek et al., 1987, The EMBO Journal 6:3325-3331), while
the class III isoforms do not (Buschman et al., 1994, Cancer Research 54:4892-4898;
Schinkel et al., 1991, Cancer Research 51:2628-2635). The class III isoforms mediate the
transfer of phosphatidylcholine from the inner to the outer leaflet of the plasma membrane
(i.e., "flippase") (Ruetz et al., 1994, Journal of Biological Chemistry 269:12277-12284; Smit
et al., 1993, Cell 75:451-462). In normal tissues, P-gp distribution is restricted mainly to
tissues with secretory functions (O'Brien et al., 1996, Multidrug Resistance in Cancer Cells:
Molecular, Biochemical, Physiological and Biological Aspects. Editors: Gupta, S. and
Tsuruo, T., John Wiley & Sons, pp. 285-292; Weinstein et al., 1990, Human Pathology
21:34-48). Its polarized localization to apical surfaces facing a lumen in the adrenal gland,
liver, kidney and intestine suggests a normal transport or detoxification mechanism. Moreover,
hematopoietic stem cells and specific lymphocyte subclasses also express high levels of
P-gp (Gupta, S., 1996, Multidrug Resistance in Cancer Cells: Molecular, Biochemical,
Physiological and Biological Aspects. Editors: Gupta, S. and Tsuruo, T., Wiley, NY, pp. 293-302).
The normal function or substrate(s) of classes I and II remain undefined; however,
the disruption of the class I and/or II genes from the mouse genome results in the
accumulation of cytostatic drugs or lipophilic compounds in most normal tissues, but more
strikingly in the brain (Schinkel et al., 1994, Cell 77:491-502; Smit et al., 1993, Cell
75:451-462). Based on these results it is speculated that the normal function of P-gp (the class I
and II or the MDR-causing P-gp) is detoxification similar to that seen in MDR cells,
especially at the blood-brain barrier (Jolliet-Riant et al., 1999, Fundam. Clin. Pharmacol.
13:16-26).

High levels of P-gp have been found in many intrinsically drug resistant tumors
from colon, kidney, breast and adrenals as well as in other tumors which had acquired the
MDR phenotype after chemotherapy (for example, in acute non-lymphoblastic leukemia)
(Cornelissen et al., 1994, Journal of Clinical Oncology 12:115-119; Futscher et al., 1993,
Analytical Biochemistry 213:414-421; Grogan et al., 1990, Laboratory Investigation
63:815-824; Herweijer et al., 1990, Journal of the National Cancer Institute 82:1133-1140;
Nooter et al., 1994, Leukemia Research 18:233-243). Several studies have now established
an inverse correlation between P-gp expression and the response to chemotherapy (Bates et
al., 1995, Cancer Chemotherapy & Pharmacology 35:457-463; Ro et al., 1990, Human
Pathology 21:787-791; Verrelle et al., 1991, Journal of the National Cancer Institute
83:111-116). Further, Chan et al. (Chan et al., 1995, Hematology - Oncology Clinics of North
America 9:275-318; Chan et al., 1991, New England Journal of Medicine 325:1608-1614) have
shown that P-gp expression was prognostic of MDR and durable response in childhood
leukemia, soft tissue sarcomas and neuroblastomas of children. In light of these studies there
appears to be convincing evidence, at least in some cancers, that P-gp levels predict the
response to chemotherapeutic treatment.

Direct binding between P-gp and various lipophilic compounds has been
demonstrated using photoactive drug analogues (Nare et al., 1994, Biochemical
Pharmacology 48:2215-2222; Safa et al., 1986, J. Biol. Chem. 261:6137-6140; Safa, A.
R., 1993, Cancer Investigation 11:46-56). Certain compounds which bind to P-gp were
shown to reverse the MDR phenotype, presumably by competing for the same drug binding
site in P-gp (Ford et al., 1990, Pharmacological Reviews 42:155-199; Georges et al.,
1990, Advances in Pharmacology 21:185-220). These compounds, which have been
collectively labeled as MDR-reversing agents, include verapamil, quinidine, ivermectin,
cyclosporins, and dipyridamole analogues, to name but a few (Ford et al., 1990,
Pharmacological Reviews 42:155-199; Georges et al., 1990, Advances in Pharmacology
21:185-220). Clinical trials using MDR-reversing agents (e.g., verapamil or quinidine) have
shown some response in tumors that were otherwise non-responsive to chemotherapy
(Dalton et al., 1995, Cancer 75:815-820; Goldstein, L. J., 1995, Curr. Probl. Cancer 19:65-
124; Wigler, P. W., 1996, J. Bioenerg. Biomembr. 28:279-84). However, high
pharmacological toxicity associated with several MDR-reversing agents has prevented their
use at effective concentrations (List et al., 1993, Journal of Clinical Oncology 11:1652-
1660). A better clinical response has been observed using other MDR-reversing agents
(i.e., cyclosporin A and its non-immunosuppressive analog PSC833); however, toxic
effects have also been seen with cyclosporins (Sonneveld et al., 1994, Journal of Clinical
Oncology 12:1584-91; Watanabe et al., 1995, Acta Oncologica 34:235-241).

P-gp was shown to be a substrate for protein kinases C and A (Ahmad et al.,
1994, Biochemistry 33:10313-10318; Chambers et al., 1994, Biochemical Journal
299:309-315). Moreover, it has been demonstrated that agents which modulate protein
kinase C activity modulate P-gp phosphorylation and the MDR phenotype it mediates (Bates
et al., 1993, Biochemistry 32:9156-9164; Chambers et al., 1990, Biochem. Biophys. Res.
Commun. 169:253-259). In one study (Fine et al., 1988, Proc. Natl. Acad. Sci. USA
85:582-586), PMA phorbol ester (a protein kinase C activator) was shown to increase the
MDR phenotype and drug efflux in MCF7 breast cancer cells. In another study (Bates et al.,
1992, Biochemistry 31:6366-6372), sodium butyrate treatment of SW620 human colonic
carcinoma cells was shown to result in a large increase in P-gp expression without a
concomitant increase in drug resistance or efflux. Interestingly, P-gp in SW620 cells was
also shown to be poorly phosphorylated following sodium butyrate treatment (Bates et al.,
1992, Biochemistry 31:6366-6372). The reason for the lack of transport function of P-gp
in SW620 cells was not clear; however, mutations of P-gp phosphorylation sites within the
linker domain were shown not to affect its drug transport function (Germann et al., 1996, J.
Biol. Chem. 271:1708-16). By contrast, protein kinase C modulation of serine/threonine
residues in the linker domain regulated the activity of an endogenous chloride channel and
thus suggests that P-gp is a channel regulator (Gill et al., 1992, Cell 71:23-32; Valverde et
al., 1992, Nature 355:830-833). Thus, although it remains unclear what functions the linker
domain of P-gp1 mediates, it was of interest to identify the proteins that interact with the linker
domain using an in vitro assay. The latter assay is based on the novel understanding of
protein interactions provided by the present invention. The results hereinbelow show that
three sequences in the linker domain bind to proteins with apparent molecular masses of
~80 kDa, 57 kDa and 30 kDa. Purification and partial N-terminal amino acid sequencing
of the 57 kDa protein showed that it encodes the N-terminal amino acids of α- and
β-tubulins.

Thus, using a protein domain as an example of a validation of the power of the
present invention, it was demonstrated that: i) this domain binds specifically to proteins;
ii) the specifically binding proteins can be formally identified; and iii) the sequence
responsible for the specific binding of these proteins can be formally identified (together with the
interacting domain of this binding protein, if derived).

EXAMPLE 2

Materials 

[35S]methionine (1000 Ci/mmol; Amersham Life Sciences, Inc.) and [125I] goat
anti-mouse antibody were purchased from Amersham Biochemical Inc. Protein-A
Sepharose-4B was purchased from Bio-Rad Life Science. All other chemicals used were
of the highest commercial grade available.

EXAMPLE 3 

Peptide Synthesis 

Prederivatized plastic rods, active ester and polypropylene trays were
purchased from Cambridge Research Biochemicals (Valley Stream, NY). Peptides were
synthesized on solid polypropylene rods as previously described (Georges et al., 1993,
Journal of Biological Chemistry 268:1792-1798; Georges et al., 1991, Journal of Cellular
Physiology 148:479-484). Briefly, the Fmoc protecting group on the prederivatized
polypropylene rods used as solid support (arranged in a 96-well format) was removed by
incubation with 20% (v/v) piperidine in dimethylformamide (DMF) for 30 minutes with
shaking. Following the deprotection of the β-alanine spacer on the polypropylene rods,
Fmoc protected amino acids were dissolved in HOBt/DMF and added to the appropriate
wells containing deprotected rods. Coupling of amino acids was allowed to take place for
18 hours at room temperature, after which the rods were washed in DMF (1 x 2 minutes),
methanol (4 x 2 minutes), and DMF (1 x 2 minutes). The coupling of the second amino
acid required the deprotection of the Fmoc amino protecting group of the first amino acid
and incubation of the rods with the second preactivated Fmoc protected amino acid
(pentafluorophenyl derivatives). The reaction was allowed to proceed for 18 hours and the
rods were removed and washed as indicated above. The same steps were repeated for
each amino acid coupling until the sixth amino acid was coupled. Following the last
coupling step, the Fmoc N-terminal protecting group was removed with 20%
piperidine/DMF and the free amino group acetylated for 90 minutes in an acetylation
cocktail containing acetic anhydride: diisopropylethylamine (DIEA): DMF (50:1:50 v/v/v).
The side chain protecting groups of the N-terminal acetylated hexapeptides on the
polypropylene rods were removed by incubation in a cleavage mixture containing
trifluoroacetic acid: phenol: ethanedithiol (95:2.5:2.5 v/v/v) for 4 hours at room temperature.
After the cleavage step the rods were washed with dichloromethane (DCM) and neutralized
in 5% (v/v) DIEA/DCM. The deprotected peptide-coupled rods were washed in DCM and
methanol and vacuum dried for 18 hours.

EXAMPLE 4 

Tissue Culture and Metabolic Labeling of Cells

Drug sensitive (CEM) and resistant (CEM/VLB^ °) cells were cultured in
α-MEM media supplemented with 10% fetal calf serum (Hyclone, Inc.) as previously described
(Beck, W. T., 1983, Cancer Treat. Rep. 67:875-882). All cells were examined for
Mycoplasma contamination every three months using the Mycoplasma PCR kit from
Stratagene Inc., San Diego, CA. For metabolic labeling of cells, CEM or CEM/VLB^ ° cells
at 70-80% confluency were metabolically labeled with [35S]methionine (100 µCi/ml) for 6
hours at 37°C in methionine-free α-MEM media.

EXAMPLE 5 

Cell Extraction and Binding Assay

Following metabolic labeling of proteins with [35S]methionine, cells were
washed 3 times with phosphate buffered saline (PBS) and resuspended in hypotonic buffer
(10 mM KCl, 1.5 mM MgCl2, 10 mM Tris-HCl, pH 7.4) containing protease inhibitors (2 mM
PMSF, 3 ng/ml leupeptin, 4 ng/ml pepstatin A and 1 µg/ml aprotinin) and kept on ice for
30 minutes. Cells were lysed by homogenization in the hypotonic buffer and the cell lysate
was centrifuged at 6000 x g for 10 minutes. Following the latter centrifugation,
the supernatant was removed and made 0.5 M NaCl final concentration from a stock
solution of 4 M NaCl. The cell lysate was incubated on ice for 30 minutes. The sample was
mixed and brought back to 0.1 M NaCl final concentration. The cell lysate was centrifuged
for 10 minutes at 15,000 x g at 4°C. The latter supernatant was removed and recentrifuged
at 100,000 x g for 60 minutes in a Beckman ultracentrifuge using an SW55 rotor. The amount
of protein in the above samples was determined by the method of Lowry (Lowry et al.,
1951, J. Biol. Chem. 193).
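
The two salt adjustments in this extraction are simple mass-balance calculations, sketched below. The sketch assumes that the supernatant contains negligible NaCl before the stock is added (the hypotonic buffer lists none) and that the diluent used to return to 0.1 M is NaCl-free; neither point is stated explicitly in the text, and the function names are hypothetical.

```python
def stock_volume_to_add(sample_volume, stock_conc, target_conc, sample_conc=0.0):
    """Volume of stock to add so that the mixture reaches target_conc.

    Solves stock_conc * x + sample_conc * V = target_conc * (V + x) for x.
    """
    if not sample_conc <= target_conc < stock_conc:
        raise ValueError("need sample_conc <= target_conc < stock_conc")
    return sample_volume * (target_conc - sample_conc) / (stock_conc - target_conc)


def diluent_volume_to_add(volume, current_conc, target_conc):
    """Volume of salt-free diluent needed to lower current_conc to target_conc."""
    if not 0 < target_conc <= current_conc:
        raise ValueError("need 0 < target_conc <= current_conc")
    return volume * (current_conc / target_conc - 1)


# Example with a hypothetical 7 ml supernatant:
supernatant_ml = 7.0
stock_ml = stock_volume_to_add(supernatant_ml, stock_conc=4.0, target_conc=0.5)
salted_ml = supernatant_ml + stock_ml
diluent_ml = diluent_volume_to_add(salted_ml, current_conc=0.5, target_conc=0.1)
print(stock_ml, diluent_ml)
```

Under these assumptions, each 7 volumes of supernatant require 1 volume of 4 M stock to reach 0.5 M, and the return to 0.1 M is a five-fold dilution (4 volumes of diluent per volume of salted lysate).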

For a binding assay, [35S]methionine labeled proteins from total cell lysate
were mixed with an equal volume of 3-6% BSA in phosphate buffered saline (PBS) and
incubated with overlapping hexapeptides covalently fixed to polypropylene rods. The
peptides and total cell lysate were incubated overnight at 4°C. The rods were then
removed and washed four times in PBS. The bound proteins were eluted by incubating the
peptide-fixed rods in 1X SDS sample buffer for 60 minutes at room temperature with
shaking. The peptide-fixed rods were regenerated by incubation in PBS containing 2%
SDS and 1 mM β-mercaptoethanol at 65°C in a sonicator for 30 minutes. Following the
latter incubation, the rods were washed for five minutes in 65°C deionized water and two
minutes in 65°C methanol. The peptide-fixed rods were now ready for the next round of
screening. In cases where the effects of various detergents on binding were tested,
[35S]methionine labeled proteins from total cell lysate were mixed with an equal volume of 3% BSA
in phosphate buffered saline containing KCl (300 mM to 1200 mM), SDS (0.12% to 2%),
or CHAPS (20 mM to 160 mM) and incubated with covalently fixed peptides as described
above.

EXAMPLE 6 

Polyacrylamide Gel Electrophoresis and Western Blotting

Protein fractions (100-150 fA) were resolved on SDS-PAGE using the Laemmli
gel system (Laemmli, U. K., 1970, Nature 227:680-685). Briefly, proteins were dissolved
in 1X solubilization sample buffer I (62.5 mM Tris-HCl, pH 6.8, containing 2% (w/v) SDS,
10% (w/v) glycerol and 5% β-mercaptoethanol) and samples were electrophoresed at
constant current. Gel slabs containing the resolved proteins were fixed in 50% methanol
and 10% acetic acid. Polyacrylamide gels containing [35S]methionine labeled proteins were
exposed to Kodak x-ray film following a thirty-minute incubation in an Amplify™ solution
(Amersham Inc.).

Alternatively, proteins were transferred to nitrocellulose membrane in Tris-glycine
buffer in the presence of 20% methanol for Western blot analysis according to the
procedure of Towbin et al. (Proc. Natl. Acad. Sci. USA 76:4350-4354, 1979).
Nitrocellulose membrane was incubated in 5% skim milk/PBS prior to the addition of anti-α
or anti-β tubulin monoclonal antibodies (0.5 µg/ml in 3% BSA; Amersham, Inc.). Following
several washes with PBS, the nitrocellulose membrane was incubated with goat anti-mouse
peroxidase conjugated antibody and immunoreactive proteins were visualized by
chemiluminescence using the ECL method (Amersham Inc.).

EXAMPLE 7 

Protein Purification and N-terminal Sequencing 

The 57 kDa associated protein was purified using a block of polypropylene
rods with two high affinity binding peptides. Briefly, the peptide-fixed rods were incubated
with total cell lysate as indicated above; however, in this case the carrier substance was
gelatin (1%). The bound proteins were eluted in 100 mM phosphate buffer, pH 7.4,
containing 2% SDS and 0.1% β-mercaptoethanol. The eluted proteins were precipitated
by mixing with 9 volumes of ice cold ethanol and incubated at -20°C. Following a high
speed centrifugation of the latter sample (15 minute centrifugation at 15,000 x g, at 4°C),
the precipitated proteins were resuspended in 1% SDS in PBS and mixed with an equal
volume of 2X SDS Laemmli sample buffer (Laemmli, U. K., 1970, Nature 227:680-685).
Protein samples were resolved on 10% SDS-PAGE and transferred to PVDF membrane.
The migration of the 57 kDa band was visualized by staining the PVDF membrane with
Ponceau S. The PVDF membrane containing the 57 kDa band was excised and submitted
to the protein sequencing facility at the Biotechnology Service Centre in Toronto, Ontario.
Amino acid sequencing of peptides was performed according to the method of Edman and
Begg (Eur. J. Biochem. 1:80-91, 1967) using an Applied Biosystems Gas-Phase Model
470A sequenator™ according to the procedure described by Flynn (Flynn et al., 1983,
Biochem. Biophys. Res. Commun. 117:859-65).

EXAMPLE 8 
Identification of P-gp Interacting Proteins

As explained above, P-gp is a tandemly duplicated molecule made up of two
halves, each containing six transmembrane domains and an ATP-binding domain.
The two halves of P-gp are joined by a linker domain (Gros et al., 1986, Cell 47:371-380;
Roninson et al., 1986, Proc. Natl. Acad. Sci. USA 83:4538-4542). Of the 90 amino
acids that make up the linker domain, 32 are either positively or negatively
charged at physiological pH. While P-gp phosphorylation sites appear to have relevance
to P-gp function, the function of the linker domain of P-gp remains unknown. To identify
and dissect the role of this domain in MDR, the overlapping peptides method of the present
invention was used. A novel approach was developed to isolate interacting proteins using
overlapping synthetic hexapeptides. The use of overlapping peptides to isolate interacting
proteins allows the specific identification of interacting proteins and bypasses many of the
problems associated with the use of random peptides. Figure 5 shows the amino acid
sequences of the linker domains of P-gp1 and P-gp3. The two linker domains
share 41% amino acid sequence identity or 66% sequence homology.
Overlapping hexapeptides were synthesized in parallel on derivatized polypropylene rods
as previously described (Georges et al., 1990, Proc. Natl. Acad. Sci. USA 87:152-156;
Georges et al., 1993, J. Biol. Chem. 268:1792-1798). A total of 92 and 90 hexapeptides were
synthesized to cover the entire linker sequence of P-gp1 and P-gp3, respectively. The
hexapeptides remain covalently attached to the polypropylene rods.
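
The tiling described above can be illustrated with a minimal sketch, assuming a
one-residue offset between consecutive peptides. The function name
`overlapping_peptides` and its parameters are hypothetical and introduced only for
illustration; the input is the 20-residue P-gp1 linker stretch SRSSLIRKRSTRRSVRGSQA
cited in this Example, not the full linker sequence of Figure 5.

```python
def overlapping_peptides(sequence, length=6, step=1):
    """Return every window of `length` residues, advancing `step` residues each time."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    # An empty list results when the sequence is shorter than one window.
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Illustrative input: a 20-residue stretch of the P-gp1 linker domain.
region = "SRSSLIRKRSTRRSVRGSQA"
hexapeptides = overlapping_peptides(region)

for number, peptide in enumerate(hexapeptides, start=1):
    print(number, peptide)
```

At a one-residue offset, a sequence of N residues yields N - 5 hexapeptides, so this
20-residue stretch gives 15 peptides, among them RSSLIR and SVRGSQ.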

To identify the proteins interacting with the various hexapeptides of the linker
domains, the peptide-fixed rods were incubated with total cell lysate from [35S] methionine
metabolically labeled CEM or CEM/VLB1.0 cells. After washing off non-specifically bound
lysate proteins, the specifically bound proteins were eluted with SDS-containing buffers
and resolved on SDS-PAGE. Figure 6 shows the proteins specifically bound to the 92
overlapping hexapeptides from the P-gp1 linker sequence. Three regions in the P-gp1 linker
domain (EKGIYFKLVTM, SRSSLIRKRSTRRSVRGSQA and PVSFWRIMKLNLT) bound a 57 kDa
protein. Hexapeptides 46-60, 81-89 and 5-9 (see Figure 5) bound the 57 kDa protein
with decreasing affinities (Figure 6). Moreover, peptides 46-60 bound two other proteins
with apparent molecular masses of 80 kDa and 30 kDa, although much more weakly than the
57 kDa protein. It is likely that the latter proteins (80 kDa and 30 kDa) are associated
with the 57 kDa protein, since they are detected when the intensity of the 57 kDa protein
signal is high (Figure 6, peptides 50-56). Comparison of the amino acid sequences of the
three 57 kDa-binding regions did not reveal significant sequence homology among them to
account for their binding to the same protein. Interestingly, however, the amino acid
sequence of the second region (peptides 46-60) contains protein kinase C consensus
sequences (Chambers et al., 1993, J. Biol. Chem. 268:4592-4595). In addition, the third
region (peptides 81-89) was also shown to contain a protein kinase A site (Glavy et al.,
1997, J. Biol. Chem. 272:5909-5914).
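
The relation between a run of consecutive binding hexapeptides and the stretch of
residues they jointly cover can be sketched as follows. This is an illustrative
calculation only, assuming that peptide k starts at residue k of the synthesized
sequence and that consecutive peptides are offset by one residue; the function name
`covered_residues` is hypothetical, and the numbering used in Figure 5 may have a
different origin.

```python
def covered_residues(first_peptide, last_peptide, length=6):
    """Return (start, end, n_residues) covered by peptides first..last (1-based, inclusive).

    Assumes peptide k spans residues k .. k + length - 1.
    """
    if first_peptide < 1 or last_peptide < first_peptide:
        raise ValueError("expected 1 <= first_peptide <= last_peptide")
    start = first_peptide
    end = last_peptide + length - 1
    return start, end, end - start + 1


start, end, n = covered_residues(46, 60)
print(f"peptides 46-60 cover residues {start}-{end} ({n} residues)")
```

Under these assumptions, hexapeptides 46-60 cover 20 residues, which is the length of
the SRSSLIRKRSTRRSVRGSQA stretch.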

To assess the strength of binding between the hexapeptides and the 57 kDa protein,
the effects of high salt (0.3-2.4 M KCl), zwitterionic detergent (10-160 mM CHAPS) and
ionic detergent (0.1%-2% SDS) on the interactions between the hexapeptides derived from
SRSSLIRKRSTRRSVRGSQA and the 57 kDa protein were examined. Our results show the binding
to be stable to high salt, moderately stable to high concentrations of CHAPS, but
sensitive to low concentrations of SDS (Figure 7). Given the stability of protein binding
to covalently attached peptides in the presence of 10 mM CHAPS, it was of interest to
determine the binding of the hexapeptides from the P-gp1 linker domain to CHAPS-soluble
proteins, which could include integral membrane proteins. Figure 8 shows the proteins
bound to the same overlapping hexapeptides that span the linker domain of P-gp1. Although
hexapeptides 46-60, 81-89 and 5-9 (see Figure 5) bound to the 57 kDa protein (Figure 7),
other proteins were found to interact with the same or different hexapeptides, which did
not bind these proteins in the absence of 10 mM CHAPS. For example, hexapeptides 3-10
bound to a ~210 kDa protein that was not detected previously in the absence of CHAPS.
Similarly, hexapeptides 16-20, which did not bind any proteins in the absence of CHAPS,
bound to the same high molecular weight protein (Figure 7). Peptides 40-60 bound more
strongly to several low molecular weight proteins (~45-25 kDa) in the presence of CHAPS.
Hexapeptides 80-89 bound to two other proteins in addition to the 57 kDa protein. Taken
together, the results in Figure 8 demonstrate that the binding of the various
hexapeptides to the 57 kDa protein is resistant to mild zwitterionic detergents such as
CHAPS. Moreover, solubilization of membrane proteins in 10 mM CHAPS reveals
binding to other proteins not seen in the absence of 10 mM CHAPS. One possibility is that
10 mM CHAPS allows integral membrane proteins to interact with the various
hexapeptides of the P-gp1 linker domain. Alternatively, CHAPS may expose new domains that
in turn allow binding to hexapeptides of the P-gp1 linker domain. In addition, some of the
lower molecular weight proteins that bound to hexapeptides 40-60 and 80-89 may be
degradation products of the 57 kDa protein (Figure 8).

The P-gp gene family in man comprises two isoforms, P-gp1 and P-gp3
(or mdr1 and mdr3; Childs et al., 1994, Important Adv. Oncol., pp. 21-36). However, as
indicated earlier, only P-gp1 confers an MDR phenotype. Moreover, although P-gp1 and
P-gp3 share about 80% amino acid sequence homology (Van der Bliek et al., 1987, The EMBO
Journal 6:3325-3331), the linker domain is the most variable domain between the two
isoforms, with 66% amino acid sequence homology. To determine if the P-gp3 linker
domain binds to the same or different proteins, overlapping hexapeptides spanning the
P-gp3 linker domain were synthesized on polypropylene rods and their binding to soluble
proteins was examined as indicated above. Figure 9 shows the profile of proteins binding
to the hexapeptides of P-gp3. Interestingly, a protein of similar molecular weight (57 kDa)
also bound to the hexapeptides from P-gp3. However, the binding to some hexapeptides
was different from that seen with P-gp1 (Figure 6 versus Figure 9). For P-gp3, three
larger stretches of amino acids (LMKKEGVYFKLVNM,
KAATRMAPNGWKSRLFRHSTQKNLKNS and PVSFLKVLKLNKT) bound to
the 57 kDa protein. The first and third regions of the P-gp3 linker domain share considerable
sequence identity with the first and third regions of the P-gp1 linker domain (Figure 10).
Hence, it is not surprising that the corresponding hexapeptides bound to the same protein.
The second regions of the P-gp1 and P-gp3 linker domains are different (Figure 10).
Consequently, although both the P-gp1 and P-gp3 sequences bound to a 57 kDa protein, the
region of interaction between P-gp3 and the 57 kDa protein is larger than that of P-gp1
(Figure 6 and Figure 9). A comparison of the amino acid sequences of the P-gp1 and P-gp3
binding hexapeptides is shown in Figure 10.

EXAMPLE 9 

Purification and Sequencing of the 57 kDa Protein

To determine the identity of the 57 kDa protein, several copies of two
hexapeptides (RSSLIR and SVRGSQ) from the second region of the P-gp1 linker
domain were synthesized. These hexapeptide sequences were those that bound with
the highest affinity to the 57 kDa protein. Figure 11 shows the binding of these two
peptides to total cell lysate from [35S] methionine metabolically labeled cells. Both
hexapeptides bound specifically to the 57 kDa protein and to another protein with an apparent
molecular mass of ~41 kDa. Interestingly, longer incubation times of the total cell lysate led
to an increase in the level of the 41 kDa protein (Figure 11). Thus, the 41 kDa band is likely
a degradation product of the 57 kDa protein.

To purify the 57 kDa protein using the two hexapeptides, it was of interest to
determine whether carrier proteins other than BSA could be used. Figure 12 shows the effects
of no blocking carrier, 1% gelatin and 0.3% or 3% BSA on the binding of the hexapeptides
to the 57 kDa protein. The results of this experiment were surprising in that no carrier
protein was required to reduce non-specific binding (Figure 12). The binding conditions
thus established were used to isolate large amounts of the 57 kDa protein that bound to
several copies of hexapeptides RSSLIR and SVRGSQ. Figure 13 shows
purified 57 kDa protein on SDS-PAGE stained with Coomassie blue. The purified
protein was transferred to PVDF membrane and stained with Ponceau S to localize the
position of the 57 kDa protein. The Ponceau S stained band that migrated with the
expected molecular mass was cut out and used for direct N-terminal sequencing (Flynn et
al., 1983, Biochem. Biophys. Res. Commun. 117:859-65). The first seven rounds of
Edman degradation showed two sequences, MREVISI and MREIVHI. These two
sequences differed by only three amino acids (VIS instead of IVH). Comparison of the two
sequences with known protein sequences using the FastA protein search engine showed that
they correspond to the first seven N-terminal amino acids of α- and β-tubulins. The
identification of the 57 kDa protein as tubulins was consistent with the apparent molecular
mass and with the potential degradation products that were observed following long incubation
periods. To further confirm the identity of the 57 kDa protein as tubulins, Western blot
analysis was performed on hexapeptide-bound 57 kDa protein and total cell lysate
resolved on SDS-PAGE and transferred to nitrocellulose membrane. The nitrocellulose
membrane was then probed with anti-α-tubulin and anti-β-tubulin monoclonal antibodies,
respectively. Figure 14 shows the results of the Western blot analysis. Consistent with the
sequencing results, both tubulin subunits (α and β) were recognized in the lanes containing
the hexapeptide-bound proteins, thus establishing the identity of the 57 kDa protein as
α- and β-tubulin.
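
The comparison of the two N-terminal sequences reported above can be reproduced with a
short position-by-position check. This is a minimal sketch; the function name
`differing_positions` is hypothetical and is not part of the sequencing or FastA
procedures used in this Example.

```python
def differing_positions(seq_a, seq_b):
    """Return (1-based position, residue in seq_a, residue in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# The two seven-residue sequences obtained by Edman degradation.
mismatches = differing_positions("MREVISI", "MREIVHI")
print(mismatches)  # [(4, 'V', 'I'), (5, 'I', 'V'), (6, 'S', 'H')]
```

The three mismatches fall at positions 4 to 6, matching the VIS versus IVH difference
noted in the text.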

EXAMPLE 10 

The power of the overlapping peptide spanning method of the present invention was thus
validated with P-gp. As shown above, the overlapping peptide-based method of the
present invention provides proof of principle for the hypothesis that the
region between two interacting proteins consists of high-affinity binding sequences and
repulsive sequences, and shows that such a method can be used efficiently and
successfully to identify and characterize domains and sequences of interacting proteins.
The balance of high-affinity and repulsive forces determines whether two proteins will form
a stable complex. The use of short overlapping peptides allows the identification of such
high-affinity binding sequences between bait and prey proteins. The use of short,
overlapping peptides to isolate high-affinity binding sequences is essential to the
success and efficiency of the proof of principle described herein. For instance, larger
peptides could contain both high-affinity and repulsive binding sequences in one peptide
sequence such that the net force of interaction is negative. Moreover, the use of
overlapping peptides that differ by one amino acid from the previous or next peptide
reduces the possibility of non-specific binding. Thus, overlapping peptides often
demonstrate a peak in the binding affinity of various peptides (see Figures 7 and 4). The
skilled artisan will understand that longer overlapping peptides could also be used.
Unfortunately, such larger peptides increase the risk of missing the identification of
interacting proteins due to a change in the balance between high-affinity and repulsive
amino acids.

The binding of the 57 kDa protein to three different regions in the P-gp1 and P-gp3
linker domains is consistent with the hypothesis proposed herein to explain protein
interactions (see principle of protein-protein interactions). The high-affinity binding
domains vary in size from 10 to 26 amino acids in length. In the case of the P-gp1 and P-gp3
linker domains, two of the three high-affinity binding domains shared considerable
sequence identity. The third high-affinity binding region of the linker domains
(SRSSLIRKRSTRRSVRGSQA versus
KAATRMAPNGWKSRLFRHSTQKNLKNS) shared no homology in primary
amino acid sequence. However, helical wheel presentation of these two domains shows a
cluster of positively charged residues on one face of the helix and a cluster of
serine/threonine residues on the other (see Figure 15). Interestingly, the region of
highest binding affinity to the 57 kDa protein contains the three putative phosphorylation
sites in P-gp1 (Chambers et al., 1993, J. Biol. Chem. 268:4592-4595). The positions of
the phosphorylation sites in P-gp3 have not been determined experimentally; however, they
match the consensus sequence of protein kinase C. In this respect, it is possible that
P-gp1 and P-gp3 interactions at the linker domains are modulated by phosphorylation of this
domain. Thus, although mutations of P-gp phosphorylation sites within the linker domain
were shown not to affect its drug transport function (Germann et al., 1996, J. Biol. Chem.
271:1708-16), other proposed functions of P-gp1 (e.g., regulator of an endogenous chloride
channel) were shown to be affected by its phosphorylation state (Gill et al., 1992, Cell
71:23-32; Valverde et al., 1992, Nature 355:830-833). Indeed, a member of the ABC
transporters, CFTR (the cystic fibrosis transmembrane conductance regulator), which contains
a similar linker domain, was found to co-localize with the microtubule network (Tousson et
al., 1996, J. Cell. Sci. 109:1325-34). Furthermore, microtubule-dependent acute recruitment of
CFTR to the apical plasma membrane of T84 cells was responsive to elevations in
intracellular cAMP and phosphorylation of the linker domain (Tousson et al., 1996, J. Cell.
Sci. 109:1325-34). Taken together, although it is not clear whether phosphorylation plays a
role in modulating P-gp functions in a tubulin-dependent manner, given the co-localization of
the P-gp1 phosphorylation sites and the tubulin-binding region, such a possibility is likely.
Work is in progress to determine whether phosphorylated hexapeptides bind to tubulin using
the assay described herein. Thus, the present invention opens the door to the validation of a
physiologically relevant interaction between proteinaceous domains.

The possibility that the 57 kDa protein binds to the polypropylene rods or their
derivatized moieties is unlikely, since all other rods, which are similarly derivatized, did
not bind the 57 kDa protein. Moreover, hexapeptides synthesized on at least four separate
occasions bound to the same proteins. Finally, hexapeptides from the first and third
high-affinity binding regions of the linker domains of both P-gp1 and P-gp3 bound to the
57 kDa protein. In addition to the 57 kDa protein, other proteins with apparent molecular
masses of ~80 kDa and 30 kDa also bound to some of the hexapeptides in the linker domains.
However, the binding of these proteins was much weaker than that of the 57 kDa protein, and
they may be associated proteins. Although direct measurements of binding affinities between
the various hexapeptides and the 57 kDa protein have not been made, it is interesting that
this interaction is resistant to 10 mM CHAPS and high salt. Moreover, the presence of 10 mM
CHAPS in the incubation mix led to the binding of other proteins (most notably the ~210
kDa protein) to several stretches of hexapeptides which did not bind in the absence of 10
mM CHAPS. The binding of the latter proteins to hexapeptides 15-28 is likely due
to the extraction of proteins from the membranous material, which were excluded in the
absence of CHAPS. In the absence of CHAPS, the cell lysate contained soluble proteins and
membrane-associated proteins only.

The physiological significance of P-gp1 or P-gp3 binding to tubulin is not clear.
However, tubulin has been shown to interact with several membrane proteins (Giustetto et
al., 1998, J. Comp. Neurol. 395:231-244; Haga et al., 1988, Eur. J. Biochem. 255:363-368;
Perrot-Applanat et al., 1995, J. Cell. Sci. 108:2037-2051; Ravindra, R., 1997, Endocrine
7:127-143). P-gp1 or P-gp3 interactions with tubulin and possibly microtubules may be an
example of the membrane-skeleton fence model (Jacobson et al., 1995, Science
268:1441-1442). In this model, a small fraction of membrane receptors seem to be fixed
to the underlying cytoskeleton (Sako et al., 1995, J. Cell. Biol. 129:1559-1574). It is
interesting in this respect that increases in the stability and expression of P-gp in rat liver
tumors in vivo are associated with similar increases in the stability of several cytoskeleton
proteins, including α-tubulin, β-actin, and cytokeratins 8/18 (Lee et al., 1998, J. Cell.
Physiol. 177:1-12). Work is in progress to determine the functional significance of P-gp
interactions with tubulin in vivo.

EXAMPLE 11

The Overlapping Peptides Spanning Method is not Limited to P-gp-Interacting Proteins

The overlapping peptide approach of the present invention has been further
validated with Annexin I, a soluble and membrane-associated protein, as opposed to
P-glycoprotein, a strictly transmembrane protein. Annexin I is thus structurally and
functionally different from P-glycoprotein.

Using this approach, several proteins that interact with Annexin I and the
precise amino acid sequences of Annexin I that mediate these interactions were
identified. Annexin I is a member of a large family of intracellular soluble and
membrane-associated proteins that bind phospholipids in a reversible and calcium-dependent
manner. Various members of the Annexin family have been implicated in a number of
different intracellular processes including vesicular trafficking, membrane fusion, exocytosis,
signal transduction, ion channel formation and drug resistance. Given the many
possible physiological functions of Annexin I, the method of the present invention was used
to identify its interacting proteins and the precise amino acid sequences that mediate
Annexin I interactions therewith.

Briefly, as described earlier, overlapping peptides corresponding to the entire
amino acid sequence of Annexin I (total of ~340 peptides plus controls) were synthesized
on a solid support as described above. In this case, overlapping heptapeptides, as
opposed to hexapeptides, were used. The peptides were then incubated with total cellular
proteins isolated from MCF7 breast tumor cells that were metabolically labeled with [35S]
methionine. Following several washes, the bound proteins were eluted and resolved on
SDS-PAGE as outlined above. The results are consistent with previous results with
P-glycoprotein, as the method led to the identification of several islands of Annexin I amino
acid sequences (data not shown) which interacted with five proteins ranging in molecular
mass from 10 kDa to 200 kDa (specifically, ~10 kDa, ~29 kDa, ~85 kDa, ~106 kDa and
~200 kDa). Briefly, 8 interacting domains having high affinity for the cellular proteins of the
extract were identified. Two of these high-affinity islands were located in the tail domain
of Annexin I (residues 1-36) and 6 in the α-helical bundles of Annexin I (residues 37 to the
end; see for example WO 99/21980). The identity of the latter interacting proteins is
presently under study. However, the interaction of a 10 kDa protein with Annexin I is
consistent with earlier work which demonstrated a direct interaction between Annexin I
and S100C protein (Mailliard et al., 1996, J. Biol. Chem. 271:719-725).

Thus, the present invention is shown to enable the simple and efficient
identification of high-affinity protein interactions, as well as the simultaneous
identification of the precise amino acid sequence of at least one of the interacting partners.

CONCLUSIONS 

In conclusion, a simple approach to identify P-gp interacting proteins from a
total cell lysate has been used. Moreover, this approach allows for the identification of the
precise amino acid sequences in the P-gp1 and P-gp3 linker domains that mediate the protein
interactions with tubulins. In addition, knowledge of the high-affinity binding sequences
allows for the subsequent purification of the interacting proteins from a total mixture of
cellular proteins, as further exemplified with Annexin I. Indeed, given the simplicity of this
approach to study protein-protein interactions, it is easily applied to other proteins. Finally,
our approach is rapid and has several advantages over other currently used approaches.

Although the present invention has been described hereinabove by way of 
preferred embodiments thereof, it can be modified, without departing from the spirit and 
nature of the subject invention as defined in the appended claims. 